Data Repositories in the Humanities and the Semantic Web: Modelling, Linking, Visualising
Recommended publications
-
“What Is Digital Humanities?” Essays Like This One Are Already Genre Pieces
ADE Bulletin ◆ Number 150, 2010, p. 55. What Is Digital Humanities and What’s It Doing in English Departments? Matthew G. Kirschenbaum. The author is associate professor of English and associate director of the Maryland Institute for Technology in the Humanities at the University of Maryland. A version of this article was presented at the 2010 ADE Summer Seminar East in Adelphi, Maryland. Epigraph: People who say that the last battles of the computer revolution in English departments have been fought and won don’t know what they’re talking about. If our current use of computers in English studies is marked by any common theme at all, it is experimentation at the most basic level. As a profession, we are just learning how to live with computers, just beginning to integrate these machines effectively into writing- and reading-intensive courses, just starting to consider the implications of the multilayered literacy associated with computers. (Cynthia Selfe) WHAT is (or are) the “digital humanities,” aka “humanities computing”? It’s tempting to say that whoever asks the question has not gone looking very hard for an answer. “What is digital humanities?” essays like this one are already genre pieces. Willard McCarty has been contributing papers on the subject for years (a monograph too). Under the earlier appellation, John Unsworth has advised us “what is humanities computing and what is not.” Most recently Patrik Svensson has been publishing a series of well-documented articles on multiple aspects of the topic, including the lexical shift from humanities computing to digital humanities. -
Using the Semantic Web in Digital Humanities
Semantic Web 0 (0) 1, IOS Press. Using the Semantic Web in Digital Humanities: Shift from Data Publishing to Data-analysis and Serendipitous Knowledge Discovery. Eero Hyvönen, University of Helsinki, Helsinki Centre for Digital Humanities (HELDIG), Finland and Aalto University, Department of Computer Science, Finland; Semantic Computing Research Group (SeCo) (http://seco.cs.aalto.fi). E-mail: eero.hyvonen@aalto.fi. Editors: Pascal Hitzler, Kansas State University, Manhattan, KS, USA; Krzysztof Janowicz, University of California, Santa Barbara, USA. Solicited reviews: Rafael Goncalves, Stanford University, CA, USA; Peter Haase, metaphacts GmbH, Walldorf, Germany; one anonymous reviewer. Abstract. This paper discusses a shift of focus in research on Cultural Heritage semantic portals, based on Linked Data, and envisions and proposes new directions of research. Three generations of portals are identified: Ten years ago the research focus in semantic portal development was on data harmonization, aggregation, search, and browsing (“first generation systems”). At the moment, the rise of Digital Humanities research has started to shift the focus to providing the user with integrated tools for solving research problems in interactive ways (“second generation systems”). This paper envisions and argues that the next step ahead to “third generation systems” is based on Artificial Intelligence: future portals not only provide tools for the human to solve problems but are used for finding research problems in the first place, for addressing them, and even for solving them automatically under the constraints set by the human researcher. -
Transformation Frameworks and Their Relevance in Universal Design
Universal Access in the Information Society manuscript No. (will be inserted by the editor). Transformation frameworks and their relevance in universal design. Silas S. Brown and Peter Robinson, University of Cambridge Computer Laboratory, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK. e-mail: fSilas.Brown,[email protected] Received: date / Revised version: date. Category: Long Paper. Key words: notations, transformation, conversion, education, tools, 4DML. Abstract: Music, engineering, mathematics, and many other disciplines have established notations for writing their documents. Adjusting these notations can contribute to universal access by helping to address access difficulties such as disabilities, cultural backgrounds, or restrictive hardware. Tools that support the programming of such transformations can also assist by allowing the creation of new notations on demand, which is an under-explored option in the relief of educational difficulties. This paper reviews some programming tools that can be used to effect such transformations. It also introduces a tool, called “4DML”, that allows the programmer to [...] – Some algorithms can be simplified if the data is first transformed into a convenient structure. Many algorithms can be regarded as transformations in their own right. – Transformation can be important when presenting data to the user and when accepting user input. This last point, namely the importance of transformation in user interaction, is relevant to universal design and will be elaborated here. 1.1 Transformation in universal design. Universal design aims to develop “technologies which are accessible and usable by all citizens. thus avoiding the need for a posteriori adaptations or specialised design” [37]. -
Introducing Dbpedia Spotlight and Widenet for Digital Humanities
Semantic Corpus Exploration: Introducing DBpedia Spotlight and WideNet for Digital Humanities This tutorial provides a hands-on experience with semantic annotations for selecting text sources from large corpora. While many humanist scholars are familiar with annotation during the analysis of their selected source texts, we invite participants to discover the utility of semantic annotations to identify documents that are relevant to their research question. Introduction Digitization efforts by libraries and archives have greatly expanded the potential of historical and diachronic corpora to be used as research objects. Whereas paper-based research favors a selection of sources that are known a priori to be relevant, the digitization of sources has opened up ways to find and identify source texts that are relevant beyond the “usual suspects.” Full-text search has been the primary means to retrieve and select a suitable set of sources, e.g. as a research corpus for close reading. Achieving a balanced and defensible selection of sources, however, can be highly challenging when delimiting sources with keyword-based queries. It, too often, requires scholars to handcraft elaborate queries which incorporate long synonym lists and enumerations of named entities (Huistra and Mellink, 2016). In this tutorial, we instruct participants in the use of semantic technology to bridge the gap between the conceptually-based information needs of scholars, and the term-based indices of traditional information retrieval systems. Since we address working with corpora of a scale that defies manual annotation, we will focus on automated semantic annotation, specifically on (named) entity linking technology. Entity Linking with Spotlight We start the tutorial with a practical introduction to entity linking, using the open source DBpedia Spotlight software. -
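The contrast the tutorial draws between term-based retrieval and concept-based selection can be illustrated with a minimal Python sketch. It assumes each document already carries the entity URIs that an entity linker such as DBpedia Spotlight would attach; the corpus, the `example.org` URIs, and the function names are all invented for illustration and are not output or API of Spotlight itself.

```python
from dataclasses import dataclass, field

# Hypothetical entity identifiers; a real linker would return DBpedia URIs.
KAISER = "http://example.org/entity/Wilhelm_II"
RONTGEN = "http://example.org/entity/Wilhelm_Roentgen"


@dataclass
class Document:
    """A corpus text plus the entity URIs attached by an entity linker."""
    doc_id: str
    text: str
    entities: set = field(default_factory=set)


# Toy corpus with hand-written annotations (an assumption, not linker output).
CORPUS = [
    Document("d1", "The Kaiser abdicated in November 1918.", {KAISER}),
    Document("d2", "Wilhelm II went into exile in the Netherlands.", {KAISER}),
    Document("d3", "Wilhelm Roentgen discovered X-rays.", {RONTGEN}),
]


def keyword_search(corpus, term):
    """Term-based selection: match the literal string in the text."""
    term = term.lower()
    return [d.doc_id for d in corpus if term in d.text.lower()]


def entity_search(corpus, uri):
    """Concept-based selection: match the linked entity, whatever its wording."""
    return [d.doc_id for d in corpus if uri in d.entities]


if __name__ == "__main__":
    print(keyword_search(CORPUS, "Wilhelm"))  # misses d1, wrongly includes d3
    print(entity_search(CORPUS, KAISER))      # finds d1 and d2
```

In this toy case the keyword query both misses a relevant text that uses a different surface form and retrieves an irrelevant namesake, which is the kind of gap the synonym lists mentioned above are meant to patch by hand.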
The Text Encoding Initiative Nancy Ide
Encoding standards for large text resources: The Text Encoding Initiative. Nancy Ide, Laboratoire Parole et Langage, CNRS/Université de Provence, 29, Avenue Robert Schuman, 13621 Aix-en-Provence Cedex 1 (France); Department of Computer Science, Vassar College, Poughkeepsie, New York 12601 (U.S.A.). e-mail: ide@cs.vassar.edu. Abstract. The Text Encoding Initiative (TEI) is an international project established in 1988 to develop guidelines for the preparation and interchange of electronic texts for research, and to satisfy a broad range of uses by the language industries more generally. The need for standardized encoding practices has become increasingly critical as the need to use and, most importantly, reuse vast amounts of electronic text has dramatically increased for both research and industry, in particular for natural language processing. In January 1994, the TEI issued its Guidelines for the Encoding and Interchange of Machine-Readable Texts, which provide standardized encoding conventions for a large range of text types and features relevant for a broad range of applications. [...] considerable work to develop adequately large and appropriately constituted textual resources still remains. The demand for extensive reusability of large text collections in turn requires the development of standardized encoding formats for this data. It is no longer realistic to distribute data in ad hoc formats, since the effort and resources required to clean up and reformat the data for local use is at best costly, and in many cases prohibitive. Because much existing and potentially available data was originally formatted for the purposes of printing, the information explicitly represented in the encoding concerns a particular physical realization of a text rather than its logical structure (which is of greater interest for most NLP applications), and the [...] -
Zen and the Art of Linked Data: New Strategies for a Semantic Web of Humanist Knowledge Dominic Oldman, Martin Doerr, and Stefan Gradmann
18 Zen and the Art of Linked Data: New Strategies for a Semantic Web of Humanist Knowledge. Dominic Oldman, Martin Doerr, and Stefan Gradmann. Meaning cannot be counted, even as it can be counted upon, so meaning has become marginalized in an informational culture, even though this implies that a judgment – that is, an assignment of meaning – has been laid upon it. Meaning lives in the same modern jail which houses the soul, the self, the ego, that entire range of things which assert their existence continually but unreasonably. (Pesce, 1999) This chapter discusses the Semantic Web and its most commonly associated cogwheel, Linked Data. Linked Data is a method of publishing and enabling the connection of data, while the Semantic Web is more broadly about the meaning of this information and therefore the significance and context of the connections. They are often thought of as being synonymous, but the use of Linked Data in practice reveals clear differences in the extent to which the Semantic Web is realized both in terms of expressing sufficient meaning (not just to support scholarly activity but also interesting engagement) and implementing specific strategies (Berners‐Lee et al., 2001). These differences, particularly reflected in the approaches and outputs of different communities and disciplines operating within the humanities, bring to the fore the current issues of using Semantic technologies when working with humanities corpora and their digital representations. They also reflect more deep‐seated tensions in the digital humanities that, particularly in the Open Data world, impede the formation of coherent and progressive strategies, and arguably damage its interdisciplinary objectives. -
Debates in the Digital Humanities This Page Intentionally Left Blank DEBATES in the DIGITAL HUMANITIES
Debates in the Digital Humanities. Matthew K. Gold, editor. University of Minnesota Press, Minneapolis, London. Chapter 1 was previously published as “What Is Digital Humanities and What’s It Doing in English Departments?” ADE Bulletin, no. 150 (2010): 55–61. Chapter 2 was previously published as “The Humanities, Done Digitally,” The Chronicle of Higher Education, May 8, 2011. Chapter 17 was previously published as “You Work at Brown. What Do You Teach?” in #alt-academy, Bethany Nowviskie, ed. (New York: MediaCommons, 2011). Chapter 28 was previously published as “Humanities 2.0: Promises, Perils, Predictions,” PMLA 123, no. 3 (May 2008): 707–17. Copyright 2012 by the Regents of the University of Minnesota. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Published by the University of Minnesota Press, 111 Third Avenue South, Suite 290, Minneapolis, MN 55401-2520, http://www.upress.umn.edu. Library of Congress Cataloging-in-Publication Data: Debates in the digital humanities / [edited by] Matthew K. Gold. ISBN 978-0-8166-7794-8 (hardback); ISBN 978-0-8166-7795-5 (pb). 1. Humanities--Study and teaching (Higher)--Data processing. 2. Humanities--Technological innovations. 3. Digital media. I. Gold, Matthew K. AZ182.D44 2012, 001.3071--dc23, 2011044236. Printed in the United States of America on acid-free paper. The University of Minnesota is an equal-opportunity educator and employer. -
Four Valences of a Digital Humanities Informed Writing Analytics Gregory J
Research Note. Transforming Text: Four Valences of a Digital Humanities Informed Writing Analytics. Gregory J. Palermo, Northeastern University. Journal of Writing Analytics Vol. 1 | 2017, p. 311. DOI: 10.37514/JWA-J.2017.1.1.11. Structured Abstract • Aim: This research note narrates existing and continuing potential crossover between the digital humanities and writing studies. I identify synergies between the two fields’ methodologies and categorize current research in terms of four permutations, or “valences,” of the phrase “writing analytics.” These valences include analytics of writing, writing of analytics, writing as analytics, and analytics as writing. I bring recent work in the two fields together under these common labels, with the goal of building strategic alliances between them rather than to delimit or be comprehensive. I offer the valences as one heuristic for establishing connections and distinctions between two fields engaged in complementary work without firm or definitive discursive borders. Writing analytics might provide a disciplinary ground that incorporates and coheres work from these different domains. I further hope to locate the areas in which my current research in digital humanities, grounded in archival studies, might most shape writing analytics. • Problem Formation: Digital humanities and writing studies are two fields in which scholars are performing massive data analysis research projects, including those in which data are writing or metadata that accompanies writing. There is an emerging environment in the Modern Language Association friendly to crossover between the humanities and writing studies, especially in work that involves digital methods and media. Writing analytics accordingly hopes to find common disciplinary ground with digital humanities, with the goal of benefitting from and contributing to conversations about the ethical application of digital methods to its research questions. -
SGML As a Framework for Digital Preservation and Access. INSTITUTION Commission on Preservation and Access, Washington, DC
DOCUMENT RESUME. ED 417 748; IR 056 976. AUTHOR: Coleman, James; Willis, Don. TITLE: SGML as a Framework for Digital Preservation and Access. INSTITUTION: Commission on Preservation and Access, Washington, DC. ISBN: ISBN-1-887334-54-8. PUB DATE: 1997-07-00. NOTE: 55p. AVAILABLE FROM: Commission on Preservation and Access, A Program of the Council on Library and Information Resources, 1400 16th Street, NW, Suite 740, Washington, DC 20036-2217 ($20). PUB TYPE: Reports Evaluative (142). EDRS PRICE: MF01/PC03 Plus Postage. DESCRIPTORS: *Access to Information; Computer Oriented Programs; *Electronic Libraries; *Information Retrieval; Library Automation; Online Catalogs; *Preservation; Standards. IDENTIFIERS: Digital Technology; *SGML. ABSTRACT: This report explores the suitability of Standard Generalized Markup Language (SGML) as a framework for building, managing, and providing access to digital libraries, with special emphasis on preservation and access issues. SGML is an international standard (ISO 8879) designed to promote text interchange. It is used to define markup languages, which can then encode the logical structure and content of any so-defined document. The connection between SGML and the traditional concerns of preservation and access may not be immediately apparent, but the use of descriptive markup tools such as SGML is crucial to the quality and long-term accessibility of digitized materials. Beginning with a general exploration of digital formats for preservation and access, the report provides a staged technical tutorial on the features and uses of SGML. The tutorial covers SGML and related standards, SGML Document Type Definitions in current use, and related projects now under development. A tiered metadata model is described that could incorporate SGML along with other standards to facilitate discovery and retrieval of digital documents. -
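To make the idea of descriptive markup concrete, here is a minimal Python sketch. It rests on two stated assumptions: XML stands in for SGML (XML was designed as a simplified profile of SGML, and Python's standard library offers an XML parser but no SGML parser), and the element names `report`, `title`, `author`, `section`, `heading`, and `para` are invented for this example rather than taken from any DTD discussed in the report.

```python
import xml.etree.ElementTree as ET

# Descriptive markup names what each part *is* (title, author, paragraph),
# not how it should look. Element names are hypothetical, for illustration.
SAMPLE = """
<report>
  <title>SGML as a Framework for Digital Preservation and Access</title>
  <author>Coleman, James</author>
  <author>Willis, Don</author>
  <section>
    <heading>Tutorial</heading>
    <para>SGML is an international standard (ISO 8879).</para>
  </section>
</report>
"""


def outline(xml_text):
    """Recover title, authors, and section headings from the logical structure."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "authors": [a.text for a in root.findall("author")],
        "sections": [s.findtext("heading") for s in root.findall("section")],
    }


if __name__ == "__main__":
    print(outline(SAMPLE))
```

Because the markup records structure rather than appearance, the same file can feed a catalogue record, a table of contents, or a future rendering format without re-keying, which is the preservation argument the abstract makes.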
Status Report from AAT-Taiwan, Chen and Lu, 2017
A Status Report from AAT-Taiwan. Dances with the AAT: Moving Vocabularies into the Semantic Web. Sophy S. J. Chen, Lu-Yen Lu, Academia Sinica Digital Center. International Terminology Working Group (ITWG) Meeting, Los Angeles, USA, 11-18 Aug. 2017. Working Status: updated status of AAT-TW translation; the issues. LOD Projects: TaiUC LOD Project with Getty Vocabularies; Semantic Annotation of the Photo Collection with Getty Vocabularies; Digital Humanities Project with Getty Vocabularies: Buddhist Art History. Works on AAT-Taiwan in 2016/2017 (Sep. 2016-Aug. 2017). Contribution of 5,000 translated concepts: verification and contribution of 5,000 records to AAT (Apr.-Aug. 2017). Now 20,359 records to Getty (57% from the whole 35,485 records). Example: AAT 3000009966 fleur-de-lis (motif). The facet distributions of the AAT-Taiwan contribution (details of the 1st-7th contributions). By record type (records, percentage): Concepts 19,189 (94.2%); Guide Terms 1,159 (5.7%); Hierarchy Names 11 (0.1%); Total 20,359 (100%). By facet (concepts, percentage): Associated Concepts 1,157 (5.7%); Physical Attributes 967 (4.7%); Styles and Periods 3,488 (18.0%); Agents 1,738 (8.5%); Activities 1,550 (7.6%); Materials 2,238 (12.0%); Objects 8866 (43.5%); Total 20,359 (100%). 3,000 selected AAT-TW concepts for developing the Faceted Thesaurus of Chinese Cultural Heritages (FTCCH): cooperation with Dr. Claire Huang, University of Science and Technology Beijing, China (Jan.-Mar. -
A SHORT GUIDE to the DIGITAL HUMANITIES
This final section of Digital_Humanities reflects on the preceding chapters, but can also stand alone as a concise overview of the field. As digital methodologies, tools, and skills become increasingly central to work in the humanities, questions regarding fundamentals, project outcomes, assessment, and design have become urgent. The specifications provide a set of checklists to guide those who do work in the Digital Humanities, as well as those who are asked to assess and fund Digital Humanities scholars, projects, and initiatives. A SHORT GUIDE TO THE DIGITAL_HUMANITIES. Contents: QUESTIONS & ANSWERS; DIGITAL HUMANITIES FUNDAMENTALS; THE PROJECT AS BASIC UNIT; INSTITUTIONS AND PRAGMATICS; SPECIFICATIONS; HOW TO EVALUATE DIGITAL SCHOLARSHIP; PROJECT-BASED SCHOLARSHIP; CORE COMPETENCIES IN PROCESSES AND METHODS; LEARNING OUTCOMES FOR THE DIGITAL HUMANITIES; CREATING ADVOCACY. QUESTIONS & ANSWERS 1: DIGITAL HUMANITIES FUNDAMENTALS. What is the Digital Humanities? Digital Humanities refers to new modes of scholarship and institutional units for collaborative, transdisciplinary, and computationally engaged research, teaching, and publication. Digital Humanities is less a unified field than an array of convergent practices that explore a universe in which print is no longer the primary medium in which knowledge is produced and disseminated. What defines the Digital Humanities now? The computational era has been underway since World War II, but after the advent of personal computing, the World Wide Web, mobile communication, and social media, the digital revolution entered a new phase, giving rise to a vastly expanded, globalized public sphere and to transformed possibilities for knowledge creation and dissemination. Building on the first generation of computational humanities work, more recent Digital Humanities activity seeks to revitalize liberal arts traditions in the electronically inflected language of the 21st century: a language in which, uprooted from its long- [...] -
A Standard Format Proposal for Hierarchical Analyses and Representations
A standard format proposal for hierarchical analyses and representations. David Rizo, Department of Software and Computing Systems, University of Alicante; Instituto Superior de Enseñanzas Artísticas de la Comunidad Valenciana, Ctra. San Vicente s/n. E-03690 San Vicente del Raspeig, Alicante, Spain, [email protected]. Alan Marsden, Lancaster Institute for the Contemporary Arts, Lancaster University, Bailrigg, Lancaster LA1 4YW, UK, [email protected]. ABSTRACT: In the realm of digital musicology, standardization efforts to date have mostly concentrated on the representation of music. Analyses of music are increasingly being generated or communicated by digital means. We demonstrate that the same arguments for the desirability of standardization in the representation of music apply also to the representation of analyses of music: proper preservation, sharing of data, and facilitation of digital processing. We concentrate here on analyses which can be described as hierarchical and show that this covers a broad range of existing analytical formats. We propose an extension of MEI (Music Encoding Initiative) to al- [...] portant place in historical research, in pedagogy, and even in such humble publications as concert programme notes. Standard text is generally a crucial part of music analyses, often augmented by specialist terminology, and frequently accompanied by some form of diagram or other graphical representation. There are well established ways of representing texts in digital form, but not for the representation of analytical diagrams. In some cases the diagrams are very like music notation, with just the addition of textual labels, brackets or a few other symbols (see Figure 1). In other cases they are more elaborate, and might not make direct use of musical notation at all.