Chapter 7: Learning about By Jennifer Phillips

/A Introduction

A basic understanding of metadata – its principles, standards, and best practices – can go a long way toward launching your career in digital librarianship. At first metadata may seem like a somewhat mystifying branch of cataloging or an issue of concern primarily to software engineers and computer programmers. But metadata is not an obscure topic just for technical people. Familiarity with the basic principles of metadata is necessary for all people working in digital librarianship, and a solid foundation can help you stand out professionally.

If we think of metadata in terms of its relationship to core principles of librarianship, it becomes more approachable. This chapter will provide an overview of the concepts and define the terminology used in discussions of metadata. It is intended to be a high-level discussion of metadata rather than an explanation of the nuts and bolts of metadata implementation, which will be addressed in Chapter 8. If you come from a technical services background or if you focused on cataloging in library school, many of the ideas here may already be familiar to you. If instead you come from a public services background, or are new to library and information science in general, this chapter will familiarize you with metadata and the issues surrounding it in a library context.

The goal of this chapter is to define metadata in a way that invites you to think about how it pertains to both the public service and technical aspects of digital library work. Another aim is to introduce you to or refresh your memory about categories of metadata and metadata standards, so that you will be able to articulate the importance of metadata for modern libraries.

Demonstrating an understanding of metadata and how it relates to the librarian’s job of assisting in the discovery, access, and use of information resources can be extremely useful when trying to get involved in digital projects.

/A What is metadata?

Metadata is difficult to define briefly, because the term is used for a variety of kinds of information that describe other information. The most commonly used definition is “data about data,” but this is incomplete. To understand metadata in a general sense, it is important to bear in mind a few key points:

• metadata is information or data that is associated with other information resources

• metadata is structured information

• metadata is used to enable a range of functions with respect to the resource it describes

While metadata is a type of information that is always about other information, it can be about any form of information. In other words, metadata can describe information resources of all types – from physical books and images to web sites, audio files, datasets and software. It can be stored in a database, separate from the resources it explains, or it can be embedded in the digital files it describes. Because it is structured, metadata can be machine processed, and it is therefore fundamental to the way that information resources function and are used in an electronic environment. Finally, part of the definition of metadata should include its purpose, which is to support the description, discovery, use, management, and preservation of information resources.

A few familiar examples of metadata can help clarify the concept and illustrate the contexts in which some forms of metadata have been developed. Most of us have encountered data about digital files that is stored within the files, without necessarily thinking about it as metadata. For example, the Apple iTunes application for managing music files (MP3s) on a home computer displays songs according to the categories name, artist, album, time, track number, and genre. These are elements of metadata, most of them encoded in an ID3 tag, which sits at the beginning or the end of an MP3 file depending on the tag version. This file-based metadata is displayed in the iTunes interface and gives the user the ability to sort and search for songs according to these properties.
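
To make the idea of file-based metadata concrete, here is a minimal Python sketch that reads the oldest and simplest form of the ID3 tag, ID3v1, which occupies the last 128 bytes of an MP3 file in fixed-width fields. The function names and the sample values are illustrative assumptions, not part of any application described in this chapter, and newer ID3v2 tags (stored at the start of the file) are not handled.

```python
from typing import Optional


def parse_id3v1(data: bytes) -> Optional[dict]:
    """Read an ID3v1 tag from the last 128 bytes of an MP3 file's contents."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":  # every ID3v1 tag starts with this marker
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    fields = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre_code": tag[127],  # an index into a fixed list of genres
    }
    # ID3v1.1 convention: a zero byte followed by a nonzero byte at the
    # end of the comment field holds the track number.
    if tag[125] == 0 and tag[126] != 0:
        fields["track"] = tag[126]
    return fields


def build_sample_tag() -> bytes:
    """Build a 128-byte sample tag so the sketch runs without a real MP3."""
    def pad(value: str, width: int) -> bytes:
        return value.encode("latin-1").ljust(width, b"\x00")

    return (
        b"TAG"
        + pad("So What", 30)
        + pad("Miles Davis", 30)
        + pad("Kind of Blue", 30)
        + pad("1959", 4)
        + pad("", 28)      # empty comment
        + b"\x00"          # marker that a track number follows
        + bytes([1])       # track number
        + bytes([8])       # genre code
    )


# Stand-in for file contents: arbitrary audio bytes followed by the tag.
fake_mp3 = b"\xff" * 500 + build_sample_tag()
print(parse_id3v1(fake_mp3))
```

With a real file, the same function could be given the result of `open(path, "rb").read()`. Because every field sits at a known position, software can sort and search by artist or album without any human interpretation, which is what "structured" means in practice.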

Another example, which also illustrates how file-based metadata can be in part system-generated and in part supplied by the user, is the properties of a Microsoft Word document. In Word, you can view characteristics of a file and information about its content. The system-generated information includes the date created, date modified, size, and file type; the metadata the user can supply includes the author, title, subject, keywords and a description. This metadata allows for input from the user on the one hand, and on the other facilitates system-based operations such as the interaction of the file with the software application or operating system. You can organize, identify, and search for your documents based on both the values you have specified and the automatically generated properties.
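
As a rough illustration of where such properties live, the following Python sketch assumes a document saved in the newer .docx format, which is a zip package whose descriptive properties are stored in an XML part named docProps/core.xml. The function names and sample values are made up for the example, and older .doc files store their properties differently.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in .docx packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Maps the property names used in this chapter to elements in core.xml.
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "subject": "dc:subject",
    "keywords": "cp:keywords",
    "description": "dc:description",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(docx_file) -> dict:
    """Return the core properties of a .docx file (path or file object)."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    props = {}
    for name, path in FIELDS.items():
        element = root.find(path, NS)
        if element is not None and element.text:
            props[name] = element.text
    return props


def build_sample_package() -> io.BytesIO:
    """Build an in-memory package holding only made-up core properties.

    This is not a complete Word document; it contains just enough to
    exercise read_core_properties without needing a real file.
    """
    core_xml = (
        '<cp:coreProperties'
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
        ' xmlns:dcterms="http://purl.org/dc/terms/">'
        "<dc:title>Metadata Basics</dc:title>"
        "<dc:creator>A. Librarian</dc:creator>"
        "<cp:keywords>metadata; cataloging</cp:keywords>"
        "<dcterms:created>2012-03-01T09:00:00Z</dcterms:created>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


props = read_core_properties(build_sample_package())
for name, value in props.items():
    print(f"{name}: {value}")
```

Note that the user-supplied values (title, author, keywords) and the system-generated ones (creation date) sit side by side in the same structure, and that several of the element names are borrowed from Dublin Core, a general-purpose metadata standard.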

Because the metadata associated with an MP3 or Microsoft Word file lets users describe, arrange, search for, and select their files, these everyday examples show how metadata supports common user tasks.

Metadata has evolved from several different communities including library and information science, records management, database design, and software design. One example of metadata that most librarians are already familiar with is the MARC (Machine-Readable Cataloging) record. MARC is based on a set of rules, the International Standard Bibliographic Description (ISBD), and is designed specifically for bibliographic data to meet the needs of the library community. A MARC bibliographic record is a source of information about a bibliographic resource (book, serial, sound recording, video recording, etc.), and when you look at an online library catalog you are being presented with a view of MARC records. MARC takes the information that describes the intellectual and physical characteristics of a resource and structures it in such a way that allows it to be displayed in catalogs and shared with other systems. The MARC format for bibliographic records defines the data elements – units of data with specific meaning – and the codes used for encoding bibliographic data. For example, MARC defines the data element “title and statement of responsibility” and puts it in the “245” field. Indicator and subfield codes characterize and further mark up the data contained within the field. The first indicator indicates whether there should be an added entry for the title in the library catalog, and subfield “a” distinguishes the title from the statement of responsibility in subfield “c.” Personal author information goes in the “100” field, and imprint information (publication, distribution, etc.) goes in the “260.” As such, the basic elements for an edition of Herman Melville’s “Moby Dick” would be encoded in MARC as follows:

100 1  $aMelville, Herman,$d1819-1891.
245 10 $aMoby-Dick, or, The whale /$cHerman Melville ; foreword by Nathaniel Philbrick.
260    $aLondon :$bPenguin,$c2009.

Thus structured and encoded, bibliographic metadata can be interpreted and displayed by library system software and exchanged with other agencies, regardless of the language of the content. MARC enables the discovery, retrieval, and use of resources by making them searchable in library catalogs according to a broad set of elements, including the title, statement of responsibility, publication information, physical description, information specific to medium, and subject.
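
To show how software can take advantage of this structure, here is a small Python sketch that splits the three fields of the Melville example into tags, indicators, and subfields. It reads only the simplified display notation used in this chapter (a three-digit tag, two indicator positions, and "$"-prefixed subfields), not the binary exchange format that library systems actually transmit, and the function names are illustrative assumptions.

```python
# The three fields of the Melville example, one field per line.
SAMPLE_RECORD = [
    "100 1  $aMelville, Herman,$d1819-1891.",
    "245 10 $aMoby-Dick, or, The whale /$cHerman Melville ; "
    "foreword by Nathaniel Philbrick.",
    "260    $aLondon :$bPenguin,$c2009.",
]


def parse_marc_line(line: str) -> dict:
    """Split one displayed MARC field into tag, indicators, and subfields."""
    tag = line[:3]
    # Everything between the tag and the first "$" holds the indicators.
    head, _, body = line[3:].partition("$")
    indicators = head[1:3].ljust(2)  # a blank position means "undefined"
    # Each subfield is a one-character code followed by its data.
    # Assumption: the data itself never contains a literal "$".
    subfields = [
        (chunk[0], chunk[1:].strip()) for chunk in body.split("$") if chunk
    ]
    return {
        "tag": tag,
        "ind1": indicators[0],
        "ind2": indicators[1],
        "subfields": subfields,
    }


def first_subfield(fields: list, tag: str, code: str):
    """Return the first value of the given subfield, or None if absent."""
    for field in fields:
        if field["tag"] == tag:
            for sub_code, value in field["subfields"]:
                if sub_code == code:
                    return value
    return None


fields = [parse_marc_line(line) for line in SAMPLE_RECORD]
print("Author:   ", first_subfield(fields, "100", "a"))
print("Title:    ", first_subfield(fields, "245", "a"))
print("Publisher:", first_subfield(fields, "260", "b"))
```

Because the tag and subfield codes carry the meaning, the program can pick out the title or publisher without understanding the words themselves, which is why the same record works regardless of the language of the content.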

/A What is the purpose of metadata?

Metadata supports the use of information resources in a digital environment. As you consider the relevance of metadata to your career as a digital librarian, it may be useful to think about how metadata reflects the core values of librarianship in general and the principles that underlie library cataloging in particular. There is a clear example of this in the case of digital libraries, where