Metadata for Semantic and Social Applications

Edited by Jane Greenberg and Wolfgang Klas

Proceedings of the International Conference on Dublin Core and Metadata Applications

Universitätsverlag Göttingen

22-26 September 2008

Jane Greenberg, Wolfgang Klas (Ed.)

This work is licensed under the
Creative Commons License 2.0 “by-nd”, allowing you to download, distribute and print the document in a few copies for private or educational use, given that the document stays unchanged and the creator is mentioned.
You are not allowed to sell copies of the free version.
Published by Universitätsverlag Göttingen 2008

Proceedings of the International Conference on Dublin Core and Metadata Applications Berlin, 22-26 September 2008

DC 2008: Berlin, Germany

Edited by Jane Greenberg and Wolfgang Klas

Published by the Dublin Core Metadata Initiative, Singapore

and

Universitätsverlag Göttingen 2008

Bibliographical information of the German National Library

This publication is recorded by the German National Library; detailed bibliographical data are available here: <http://dnb.ddb.de>

This work is protected by German Intellectual Property Right Law. It is also available as an Open Access version through the publisher’s homepage and the Online Catalogue of the State and University Library of Goettingen (http://www.sub.uni-goettingen.de). Users of the free online version are invited to read, download and distribute it. Users may also print a small number for educational or private use. However they may not sell print versions of the online book.

Typesetting Beate Rabold, Sandra Lechelt, Susanne Dobratz
Humboldt- Universität zu Berlin
Cover Design Mirjam Kessler, German National Library
Manuela Schulze, Humboldt-Universität zu Berlin Margo Bargheer, Universitätsverlag Göttingen
Production Managament Stefanie Rühle
Niedersächs. Staats- und Universitätsbibliothek Göttingen

Preface

Establishing a standard like Dublin Core is like building a bridge. It makes exchange possible. Fittingly, the Dublin Core conference 2008 is taking place in Berlin, which is often called a bridge between Western and Eastern Europe. This event reaches even beyond Europe, with registered participants from nations all over the world such as United States, South Africa, Japan and New Zealand. This year's conference will focus on metadata for social and semantic applications. For the first time, alongside the English tutorials there will be tutorials in German, prepared and presented by students from the University of Applied Sciences Potsdam. After three days of plenary as well as parallel sessions the conference will close on Friday with four different seminars focussing on interoperability and metadata vocabularies. Organizing such a conference would be impossible without the invaluable help of the following six organizations: the Competence Centre for Interoperable Metadata (KIM), the Max Planck Digital Library (MPDL), the Göttingen State and University Library (SUB), the German National Library (DNB), Humboldt-Universität zu Berlin (HU Berlin) and the Dublin Core Metadata Initiative (DCMI). For the funding we would like to thank the German Research Foundation (DFG) and the Federal Ministry of Education and Research (BMBF). Additionally we would like to thank Elsevier, the Common Library Network GBV, IBM, OCLC and Sun Microsystems for their generous sponsoring. Last but not least, the conference is supported by Wikimedia Deutschland, local support community of Wikipedia, the well-known online encyclopedia. We are sure that this year’s conference will serve as a bridge between the participants and their knowledge, ideas and visions. With sincere wishes for a productive conference,

Heike Neuroth
Niedersächsische Staats- und Universitätsbibliothek Göttingen on behalf of the DC-2008 Host Organisation Committee

i

Preface

It is with great pleasure that DCMI welcomes participants to DC-2008, the 8^thannual International Conference on Dublin Core and Metadata Applications to Berlin, Germany. For DCMI, it is also a return to Germany, as we came to Frankfurt in 1999 for one of the last invitational workshops, before the conference cycle began in 2001. A lot has happened since then, most notably the establishment of the Dublin Core Metadata Element Set as ISO standard 15836. Important steps since then include the development of the extended set of DCMI Terms, the DCMI Abstract Model and the Singapore Framework for Dublin Core Application Profiles. Since those days in 1999, the Dublin Core community has grown from a small and committed group of metadata pioneers into a large community of researchers and practitioners, who come together once a year to share experiences, discuss common issues and meet people from across the planet. This year in Berlin, the program has a dual focus, with attention for semantic applications (where the focus is on machine-readable information and co-operation between automated systems) and social applications (where the focus is on co-operation between people). We believe that both forms of co-operation are crucial for enabling the interoperability that is at the heart of our work on Dublin Core metadata.

As usual, we hope that the event in Berlin will help people to gain understanding of approaches and developments in many places around the world, in many application domains and in many languages, and at the same time allow participants to get to know each other and build and extend personal and professional networks. On behalf of DCMI and its many contributors, I would like to wish everybody a very useful and pleasant conference.

Makx Dekkers
Managing Director
Dublin Core Metadata Initiative

ii

Acknowledgements

The DC2008 proceedings represented in the following pages are the end-result of a long process that includes the submission of papers, reports, and posters; reviewing the submissions; organizing the accepted work among themes; and reviewing and formatting the final copies of the accepted works for publication in these proceedings. The process required the input of all members of the Program Committee and the Publications Committee. As conference Co-Chairs, we’d like to thank and acknowledge the efforts of the members of both of these committees, and, in particular, the outstanding and tremendous efforts of Stuart Sutton, Hollie White, Bernhard Haslhofer, Stefanie Rühle, and Susanne Dobratz.

Jane Greenberg, University of North Carolina at Chapel Hill
Wolfgang Klas, Universität Wien
Program Committee Co-Chairs, DC-2008

iii

2008 Proc. Int’l Conf. on Dublin Core and Metadata Applications

Introduction

The 2008 International Conference on Dublin Core and Metadata Applications (DC-2008) is the sixteenth Dublin Core workshop, and the eighth full conference program to include peerreviewed scholarly works (Tokyo, 2001; Florence, 2002; Seattle, 2003; Shanghai, 2004; Madrid, 2005; Manzanillo, 2006; and Singapore, 2007). DC-2008 takes place in Berlin, Germany, a vibrant city in which cultural and scientific ideas are exchanged daily among the many sectors of society. Home to some of the world’s most significant libraries and scientific research centers, Berlin is an ideal location for DC-2008, and for further linking the community of researchers, information professionals, and citizens who increasingly work with metadata to support the preservation, discovery, access, use, and re-use of digital information and information associated with physical artifacts. The theme for DC-2008 is “Metadata for Semantic and Social Applications”. Standardized, schema-driven-metadata underlies digital libraries, data repositories, and semantic applications leading toward the Semantic Web. Metadata is also part of the fabric of social computing, which includes the use of wikis, blogs, and tagging. These two trends flow together in applications such as Wikipedia, where authors collectively create structured information that can be extracted and used to enhance access to and use of information sources. The papers in these proceedings address an array of significant metadata issues and questions related to metadata for semantic and social applications. The proceedings include twelve papers that are organized among the following five themes: 1. Dublin Core: Innovation and Moving Forward; 2. Semantic Integration, Linking, and KOS Methods; 3. Metadata Generation: Methods, Profiles, and Models; 4. Metadata Quality; and 5. Tagging and Metadata for Social Networking. The proceedings also include eight reports distributed among the following three themes: 1. Toward the Semantic Web, 2. Metadata Scheme Design, Application, and Use; and 3. Vocabulary Integration and Interoperability. The last part of the proceedings includes twelve extended one-page abstracts capturing key aspects of current research activities. These papers, reports, and poster abstracts present a cross-section of developments in the field of metadata, with particular attention given to several of the most pressing challenges and important successes in the area of semantic and social systems. Their publication serves as a record of the times and provides a permanent body of knowledge upon which we can build over time. We are pleased to have representation of such high quality work and to have had the input and review of an outstanding Program Committee in making the selection for this year’s conference. We are also pleased that the DC-2008 is taking place in Berlin, a city of international culture. Finally, we are honored to have had the opportunity to serve as this year’s Program Committee Co-Chairs and bring you a fine collection of work from our colleagues around the world.

Jane Greenberg, University of North Carolina at Chapel Hill
Wolfgang Klas, Universität Wien
Program Committee Co-Chairs, 2008

iv

Conference Organization

Conference Coordinators

Makx Dekkers, Dublin Core Metadata Initiative Heike Neuroth, Göttingen State University Library/ Max Planck Digital Library Germany

Program Committee Co-Chairs

Jane Greenberg, School of Information and Library Science, University of North Carolina, USA/ SILS Metadata Research Center Wolfgang Klas, Multimedia Information Systems Group, University of Vienna, Austria

Program Committee

Abdus Sattar Chaudhy, Nanyang Technological University, Singapore Aida Slavic, UDC Consortium, The Hague, The Netherlands Alistair Miles Rutherford, Appleton Laboratory, UK Allyson Carlyle, University of Washington, USA Ana Alice, Baptista University of Minho, Portugal Andrew Wilson, National Archives of Australia, AUS Andy Powell, Eduserv Foundation, UK Ann Apps, The University of Manchester, UK Bernhard Haslhofer, University of Vienna, Austria Bernhard Schandl, University of Vienna, Austria Bradley Paul Allen, Siderean Software, Inc., USA Charles McCathie, Nevile Opera, Norway Chris Bizer, Freie Universität Berlin (FU Berlin), Germany Chris Khoo, Nanyang Technological University, Singapore Cristina Pattuelli, Pratt Institute, USA Corey Harper, New York University, USA Dean Kraft, Cornell University Library, USA Diane Ileana Hillmann, Cornell University Library, USA Dion Goh, Nanyang Technological University, Singapore Eric Childress, OCLC, USA Erik Duval, Dept. Computerwetenschappen, Katholieke Universiteit Leuven, Belgium Eva Mendez, University Carlos III of Madrid, Spain Filiberto F. Martinez-Arellano, National Autonomous University of Mexico, Mexico Gail Hodge, Information International Associates, USA Hollie White, University of North Carolina, Chapel Hill, USA Igor Perisic, LinkedIn, USA Jacques Ducloy, Institut de l’Information Scientifique et Technique, France Jane Hunter, University of Queensland, AUS Jian Qin, Syracuse University, USA John Kunze; California Digital Library, USA Joseph A. Busch, Taxonomy Strategies LLC, USA Joseph Tennis, University of Washington, USA Juha Hakala, Helsinki University Library - The National Library of Finland, Finland Kathy Wisser, University of North Carolina, Chapel Hill, USA Leif Andresen, Danish Library Agency, Denmark Liddy Nevile, Dep. of Computer Science & Computer Engineering, La Trobe University, AUS Marcy Lei Zeng, Kent State University, USA Michael Crandall, University of Washington, USA

v
Miguel-Angel Sicilla, University of Alcala, Spain Mikael Nilsson, Royal Institute of Technology, Sweden Mitsuharu Nagamori, University of Tsukuba, Japan Paul Miller, Talis, UK Pete Johnston, Eduserv Foundation, UK Sandy Roe, Illinois State University, USA Sebastian R. Kruk, DERI Galway, Ireland Shigeo Sugimoto, University of Tsukuba, Japan Stefanie Rühle, Research & Development Department, Göttingen State University Library, GERMANY Stuart A. Sutton, University of Washington, USA Stuart Weibel, OCLC, USA Tom Baker, Research & Development Department, Göttingen State University Library, Germany Traugott Koch, Max Planck Digital Library, Germany Wei Liu, Shanghai Library, China William E. Moen, University of North Texas (UNT), USA

Workshop Committee

Makx Dekkers, Dublin Core Metadata Initiative

Tutorial Committee

Achim Oßwald, Institute of Information Science, University of Applied Sciences, Germany John Roberts, Archives Management, New Zealand

Publications Committee

Susanne Dobratz, (Co-Chair), Humboldt-Universität zu Berlin, Germany Stefanie Rühle, (Co-Chair), Göttingen State University Library, Germany Patrick Danowski, WikiMedia, Germany Sandra Lechelt, Humboldt-Universität zu Berlin, Germany Beate Rabold, Humboldt-Universität zu Berlin, Germany Hollie White, University of North Carolina, Chapel Hill, USA

Publicity Committee

Mirjam Keßler (Chair), German National Library Frankfurt, Germany Monika Nisslein, Max Planck Digital Library, Germany Cornelia Reindl, Max Planck Digital Library, Germany

Host Organisation Committee

Laurent Romary, Max Planck Digital Library, Germany Michael Seadle, Humboldt-Universität zu Berlin, Germany Reinhard Altenhöner, German National Library Frankfurt, Germany Stefanie Rühle, Göttingen State University Library, Germany

assisted by

Peggy Beßler; Malte Dreyer; Susanne Dobratz; Stefan Farrenkopf; Christine Frodl; Katrin Gashi; Elke Greifeneder; Justine Haeberli; Rupert Kiefl; Traugott Koch; Matthias Schulz; Ulla Tschida; Andre Wobst

vi

CONTENTS

PAPER SESSION 1 DUBLIN CORE: INNOVATION AND MOVING FORWARD

Encoding Application Profiles in a Computational Model of the Crosswalk .................................3

Carol Jean Godby, Devon Smith & Eric Childress

Relating Folksonomies with Dublin Core .....................................................................................14

Maria Elisabete Catarino & Ana Alice Baptista

PAPER SESSION 2 SEMANTIC INTEGRATION, LINKING, AND KOS METHODS

LCSH, SKOS and Linked Data .................................................................................................... 25

Ed Summers, Antoine Isaac, Clay Redding & Dan Krech

Theme Creation for Digital Collections ........................................................................................34

Xia Lin, Jiexun Li & Xiaohua Zhou

Comparing Human and Automatic Thesaurus Mapping Approaches in the Agricultural Domain ......................................................................................................................43

Boris Lauser, Gudrun Johannsen, Caterina Caracciolo, Johannes Keizer, Willem Robert van Hage & Philipp Mayr

PAPER SESSION 3 METADATA GENERATION:

METHODS, PROFILES, AND MODELS

Automatic Metadata Extraction from Museum Specimen Labels ................................................57

P. Bryan Heidorn & Qin Wei

Achievement Standards Network (ASN): An Application Profile for Mapping K-12 Educational Resources to Achievement Standards ..............................................................69

Stuart A. Sutton & Diny Golder

Collection/Item Metadata Relationships .......................................................................................80

Allen H. Renear, Richard J. Urban, Karen M. Wickett, David Dubin & Sarah L. Shreeves

PAPER SESSION 4 METADATA QUALITY

Answering the Call for more Accountability: Applying Data Profiling to Museum Metadata ....93

Seth van Hooland, Yves Bontemps & Seth Kaufman

A Conceptual Framework for Metadata Quality Assessment .....................................................104

Thomas Margaritopoulos, Merkourios Margaritopoulos, Ioannis Mavridis & Athanasios Manitsaris

PAPER SESSION 5 TAGGING AND METADATA FOR SOCIAL NETWORKING

Semantic Relation Extraction from Socially-Generated Tags: A Methodology for Metadata Generation .................................................................................................................................. 117

Miao Chen, Xiaozhong Liu & Jian Qin

The State of the Art in Tag Ontologies: A Semantic Model for Tagging and Folksonomies ... 128

Hak Lae Kim, Simon Scerri, John G. Breslin, Stefan Decker & Hong Gee Kim

PROJECT REPORT SESSION 1 TOWARD THE SEMANTIC WEB

DCMF: DC & Microformats, a Good Marriage ......................................................................... 141

Eva Méndez, Leandro M. López, Arnau Siches & Alejandro G. Bravo

Making a Library Catalogue Part of the Semantic Web ............................................................. 146

Martin Malmsten

PROJECT REPORT SESSION 2 METADATA SCHEME DESIGN, APPLICATION, AND USE

The Dryad Data Repository: A Singapore Framework Metadata Architecture in a DSpace Environment ............................................................................................................................... 157

Hollie C. White, Sarah Carrier, Abbey Thompson, Jane Greenberg & Ryan Scherle

Applying DCMI Elements to Digital Images and Text in the Archimedes Palimpsest Program .................................................................................................................... 163

Michael B. Toth & Doug Emery

Assessing Descriptive Substance in Free-Text Collection-Level Metadata................................ 169

Oksana Zavalina, Carole L. Palmer, Amy S. Jackson & Myung-Ja Han

PROJECT REPORT SESSION 3 VOCABULARY INTEGRATION AND INTEROPERABILITY

Building a Terminology Network for Search: The KoMoHe project.......................................... 177

Philipp Mayr & Vivien Petras

Cool URIs for the DDC: Towards Web-Scale Accessibility of a Large Classification System.. 183

Michael Panzer

The Specification of the Language of the Field and Interoperability: Cross-language Access to Catalogues and Online Libraries (CACAO) .................................... 191

Barbara Levergood, Stefan Farrenkopf & Elisabeth Frasnelli

POSTER ABSTRACTS

Implementation of Rich Metadata Formats and Semantic Tools using DSpace ........................ 199

Imma Subirats, ARD Prasad, Johannes Keizer & Andrew Bagdanov

SKOS for an Integrated Vocabulary Structure.............................................................................200

Marcia L. Zeng, Wei Fan & Xia Lin

Exploring Evolutionary Biologists’ Use and Perceptions of Semantic Metadata for Data Curation ........................................................................................202

Hollie C. White

LCSH is to Thesaurus as Doorbell is to Mammal: Visualizing Structural Problems in the Library of Congress Subject Headings ............................................................................. 203

Simon Spero

Metadata in an Ecosystem of Presentation Dissemination...........................................................204

R. John Robertson, Phil Barker & Mahendra Mahey

A Comparison of Social Tagging Designs and User Participation ..............................................205

Caitlin M. Bentley & Patrick R. Labelle

The Data Documentation Initiative (DDI) .................................................................................. 206

Joachim Wackerow

junii2 and AIRway - an Application Profile for Scholarly Works and Its Application for Link Resolvers........................................................................................................................207

Kunie Horikoshi, Yuji Nonaka, Satsuki Kamiya, Shigeki Sugita, Haruo Asoshina & Izumi Sugita

Open Identification and Linking of the Four Ws .........................................................................208

Ryan Shaw & Michael Buckland

Web 2.0 Semantic Systems: Collaborative Learning in Science ................................................ 209

Michael Shoffner, Jane Greenberg, Jacob Kramer-Duffield & David Woodbury

Doing the LibraryThing™ in an Academic Library Catalog .......................................................211

Christine DeZelar-Tiedman

Applying DC to Institutional Data Repositories ..........................................................................212

Robin Rice

AUTHOR INDEX......................................................................................................... 213 SUBJECT INDEX ....................................................................................................... 215

Proc. Int’l Conf. on Dublin Core and Metadata Applications 2008

Full Papers Session 1:

Dublin Core: Innovation and
Moving Forward

1

2008 Proc. Int’l Conf. on Dublin Core and Metadata Applications