Integrating Metadata Standards to Support Long-Term

Proceedings
October 5-6, 2009
Mission Bay Conference Center
San Francisco, California

Integrating Metadata Standards to Support Long-Term Preservation of Digital Assets: Developing Best Practices for Expressing Preservation Metadata in a Container Format

Rebecca Guenther
Senior Networking and Standards Specialist
Library of Congress, Network Development and MARC Standards Office
Washington, DC 20540
[email protected]

Robert Wolfe
Metadata Specialist, MIT Libraries
77 Massachusetts Avenue, Cambridge, MA 02152
[email protected]

Abstract

This paper explores the purpose and development of best practice guidelines for the use of preservation metadata as detailed in the PREMIS Data Dictionary for Preservation Metadata within documents conforming to the Metadata Encoding and Transmission Standard (METS). METS is an XML schema that provides a container format integrating various forms of metadata with digital objects or links to digital objects. Because of the flexibility of METS to serve many different functions within digital systems and to support many different metadata structures, integration guidelines will facilitate common practices among institutions. There is constant tension between tighter control over the METS package to support object exchange and each implementation's unique preservation metadata requirements, given the different contexts and implementation models among PREMIS implementers. The PREMIS in METS Guidelines serve primarily as a standard for submission and dissemination information packages. This paper details the issues encountered in using the standards together, and how the METS document changes as events pertaining to the lifecycle of digital assets are recorded for future preservation purposes. The guidelines have enabled the implementation of an exchange format and creation/validation tools based on the PREMIS in METS guidelines.

Introduction

The challenge and urgency of preserving born digital and digitized information have become a great concern of all institutions responsible for maintaining the wide variety of documentation of human knowledge. Although there are clear advantages of digital over analog media, digital assets risk becoming technically obsolete. Recording key pieces of information about these assets is imperative for digital repositories that hope to preserve them over time.

The PREMIS Data Dictionary for Preservation Metadata specifies the information that a repository needs to maintain for the long-term preservation of digital objects. PREMIS itself is a list of data elements (in the Data Dictionary referred to as “semantic units”) with definitions, examples, creation notes and usage guidelines. It is neutral in terms of the type of system, database or encoding format that implements it. Because many institutions managing digital objects and their metadata use the Metadata Encoding and Transmission Standard (METS) in digital library applications as a container format, this standard is an obvious option as an implementation path. Since an important goal is the exchange of objects along with their associated metadata between repositories, many implementers of PREMIS are integrating METS with PREMIS metadata along with other information about and links to the digital objects. For example, the project Towards Interoperable Preservation Repositories (TIPR) being undertaken by FCLA, Cornell and New York University is using PREMIS embedded in METS as part of a standard Repository Exchange Package format.

Because both PREMIS and METS allow for a great deal of flexibility in their implementation, using these two digital library standards together presents issues concerning duplication and management of metadata. As an attempt to address such issues, a working group composed of PREMIS and METS experts participated in the creation of a set of guidelines for a common exchange standard, which is now being tested in digital preservation repositories.

PREMIS Background and Principles

Many institutions of different types and environments throughout the world have adopted the PREMIS Data Dictionary for Preservation Metadata as they attempt to assume responsibility for preservation of their digital assets. This comprehensive specification was first issued as version 1.0 in May 2005 and was then revised as version 2.0 in March 2008. It is maintained by the PREMIS Editorial Committee and Maintenance Activity. In the years since its publication, some countries have begun to embrace PREMIS as part of their preservation infrastructure and mandated its use for certain projects. For instance, Spain mandates that PREMIS be implemented in every digitization project funded by the Ministry of Culture.

The PREMIS Data Dictionary defines “preservation metadata” as the information a repository uses to support the digital preservation process. Specific preservation functions supported by the metadata are the maintenance of viability, renderability, understandability, authenticity, and identity of digital objects in a preservation context. Different categories of metadata may be considered “preservation metadata,” including administrative (i.e. management metadata including rights and permissions), technical (i.e. technical characteristics, often format specific) and structural (i.e. information about the relationships between parts of an object). The documentation of digital provenance (the history of an object) was considered particularly important, as was the documentation of relationships, especially relationships among different objects within the preservation repository.

The PREMIS Working Group, which originally developed the PREMIS Data Dictionary, worked on the principle that the specification would be technically neutral. No assumptions are made as to the specific digital archiving system used, the database architecture, or the archiving technology. In addition, the Data Dictionary does not specify details about metadata management, such as whether metadata is stored locally or in an external registry, or whether metadata units are recorded explicitly or known implicitly because of repository policies. The principle of technical neutrality allows for applicability in a wide range of contexts, regardless of the specific type of implementation used for collecting, storing, maintaining, and exchanging the PREMIS metadata. This sort of flexibility allows an institution to use the specification as a key piece of its infrastructure and to adapt it to its own needs. However, there is the disadvantage that implementers then must make their own particular local system decisions and establish local repository policies, which could affect the ability to exchange digital objects and their metadata with other institutions.

The PREMIS Working Group established a data model, which was meant to clarify the meaning and use of the semantic units in the Data Dictionary. It was not intended to prescribe an architecture for implementation, but defined the conceptual entities with which repositories would need to interact. The entities in the data model are Objects, Agents, Events, and Rights; Intellectual Entities are largely out of scope, but links to them are defined.

Figure 1: PREMIS Data Model

PREMIS may be implemented in a variety of ways, but, since XML is commonly used for expressing metadata, an XML schema is available to facilitate implementation. Use of the PREMIS data model is evident in the schema design, since it associates appropriate XML elements with each of the PREMIS entities (Object, Event, Agent, or Rights) to which they apply.

The PREMIS Editorial Committee provided a new feature in version 2.0 to allow implementations to include additional local metadata or to provide additional structure or granularity of metadata when PREMIS semantic units were not adequate. This extensibility mechanism is available for the following semantic units: significantProperties, objectCharacteristics, creatingApplication, environment, signatureInformation, eventOutcomeDetail, and rights. A container element corresponding to each of these semantic units is available with “extension” added to the element name. This mechanism provides the flexibility to include metadata defined outside of PREMIS within the same preservation metadata description. Of particular interest is objectCharacteristicsExtension, which allows format-specific metadata, which is out of scope for PREMIS itself, to be included in a PREMIS metadata container.

METS as an OAIS Information Package for Objects and Metadata

The Metadata Encoding and Transmission Standard (METS) defines a single document format for describing the structure of complex digital objects and associating various kinds of metadata with their components. The [...]

[...] are Producers, Managers, and Consumers. Their roles map to the exchange, archiving, and dissemination of digital objects for which METS is well suited and widely employed.

An information model is also defined within the OAIS standard for the contents of the repository upon which the agents fulfill their functions. An Information Object is defined that is identical to the complex digital objects discussed in this paper. The information object is composed of a data object and associated representation information. The data object is all of the contents of the information object, physical or digital. Representation information is that which is included in a METS document: the structure and associated metadata of the components of a digital object. A METS package is a good candidate for realization of an information [...]
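As an illustration of the container idea discussed in this paper, the following minimal Python sketch builds a PREMIS file object, attaches a local extension under objectCharacteristicsExtension, and wraps the result in a METS document. It is a sketch under stated assumptions, not the guidelines themselves: placing the PREMIS object in a METS techMD section via mdWrap with MDTYPE="PREMIS:OBJECT" is one common arrangement assumed here, the namespace URIs are those generally used for METS and PREMIS 2.x, and the identifiers (AMD1, TECH1, FILE1), the function names, and the example.org extension namespace with its bitDepth element are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"          # assumed PREMIS 2.x namespace
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XLINK_NS = "http://www.w3.org/1999/xlink"
LOCAL_NS = "http://example.org/local-tech"     # hypothetical local namespace

for prefix, uri in (("mets", METS_NS), ("premis", PREMIS_NS),
                    ("xsi", XSI_NS), ("xlink", XLINK_NS), ("local", LOCAL_NS)):
    ET.register_namespace(prefix, uri)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def premis_file_object(id_type, id_value, format_name, extension=None):
    """Build a minimal PREMIS object of type 'file' (illustrative subset only)."""
    obj = ET.Element(q(PREMIS_NS, "object"), {q(XSI_NS, "type"): "premis:file"})
    ident = ET.SubElement(obj, q(PREMIS_NS, "objectIdentifier"))
    ET.SubElement(ident, q(PREMIS_NS, "objectIdentifierType")).text = id_type
    ET.SubElement(ident, q(PREMIS_NS, "objectIdentifierValue")).text = id_value
    chars = ET.SubElement(obj, q(PREMIS_NS, "objectCharacteristics"))
    ET.SubElement(chars, q(PREMIS_NS, "compositionLevel")).text = "0"
    fmt = ET.SubElement(chars, q(PREMIS_NS, "format"))
    desig = ET.SubElement(fmt, q(PREMIS_NS, "formatDesignation"))
    ET.SubElement(desig, q(PREMIS_NS, "formatName")).text = format_name
    if extension is not None:
        # Format-specific metadata defined outside PREMIS goes in the
        # "...Extension" container for the corresponding semantic unit.
        ext = ET.SubElement(chars, q(PREMIS_NS, "objectCharacteristicsExtension"))
        ext.append(extension)
    return obj


def wrap_in_mets(premis_obj, file_href):
    """Wrap a PREMIS object in a METS document and link it to one file."""
    mets = ET.Element(q(METS_NS, "mets"))
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"), {"ID": "AMD1"})
    tech = ET.SubElement(amd, q(METS_NS, "techMD"), {"ID": "TECH1"})
    wrap = ET.SubElement(tech, q(METS_NS, "mdWrap"), {"MDTYPE": "PREMIS:OBJECT"})
    ET.SubElement(wrap, q(METS_NS, "xmlData")).append(premis_obj)

    file_sec = ET.SubElement(mets, q(METS_NS, "fileSec"))
    grp = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    # ADMID ties the file entry to the techMD section holding its PREMIS object.
    f = ET.SubElement(grp, q(METS_NS, "file"), {"ID": "FILE1", "ADMID": "TECH1"})
    ET.SubElement(f, q(METS_NS, "FLocat"),
                  {"LOCTYPE": "URL", q(XLINK_NS, "href"): file_href})

    struct = ET.SubElement(mets, q(METS_NS, "structMap"))
    div = ET.SubElement(struct, q(METS_NS, "div"))
    ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": "FILE1"})
    return mets


if __name__ == "__main__":
    local_md = ET.Element(q(LOCAL_NS, "bitDepth"))   # hypothetical local element
    local_md.text = "8"
    obj = premis_file_object("local", "obj-001", "image/tiff", local_md)
    doc = wrap_in_mets(obj, "file:///data/obj-001.tif")
    print(ET.tostring(doc, encoding="unicode"))
```

The sketch keeps the PREMIS description and the METS structure separate until the final wrapping step, which mirrors the division of labor described above: PREMIS supplies the preservation semantic units, and METS supplies the container and the links between metadata sections and files. A real package would carry further required and recommended semantic units and should be validated against the published schemas.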
