The Indexer Vol 24 No 2 October 2004
A new standard for controlled vocabularies

Emily Fayen

This article reviews the changes in the information industry that led NISO to propose a revision of ANSI/NISO Z39.19, Guidelines for the Construction, Format, and Management of Monolingual Thesauri, one of its most frequently requested Standards. In spite of its age – Z39.19 was first presented as a Standard in 1974 – it is still relevant in many parts of the information community. The Standard has been revised twice since its inception, most recently in 1993. The limitations of the existing Standard and the scope of the planned revisions are described. The article concludes with the status of the work in progress and plans for its release.

When I was fresh out of undergraduate school with a newly minted major in math and physics, I planned to continue my studies, but had no idea where to focus my work. As a result, I took a position as an Abstracter/Indexer with Documentation, Incorporated, or DocInc as it was familiarly known. At the time, the company had the contract to prepare the content for NASA’s Scientific and Technical Aerospace Reports (STAR). The original NASA Thesaurus, which predated ANSI/NISO Z39.19 by several years, supplied the terms to be used for indexing the reports that were included in the NASA database.

Mortimer Taube, the founder of Documentation, Incorporated, was one of the early proponents of what later became known as post-coordinated retrieval; that is, where concepts are indexed using terms that may be combined during the search process to achieve the desired level of specificity. The NASA Thesaurus and others developed at about the same time became the core of the work that led eventually to ANSI/NISO Z39.19. The current revision of this Standard is the subject of this paper.

Introduction

ANSI Z39.19-1974, Thesaurus Structure, Construction and Use, was first issued in 1974 and revised in 1980. In 1993, a second revision was issued under the title ANSI/NISO Z39.19-1993, Guidelines for the Construction, Format, and Management of Monolingual Thesauri. The 1993 revision draws heavily on the international (ISO 2788) and British (BS 8723) standards. Even since 1993, when the Standard was last revised, vast changes have occurred in the information industry. These have resulted from very rapid changes in computer and information processing technology and the global rise of the internet. Today, the expanding use of information databases in all aspects of business and commerce, government, and education, and the need to discover millions of sites on the internet, mean that there are thousands of applications in which controlled vocabularies of various types provide better ways to manage large amounts of content while at the same time making it easier for users to find the information they need.

Thirty years after its introduction, ANSI/NISO Z39.19 is the Standard most frequently requested for download from the NISO site. The strong interest in this standard provides evidence of the importance it holds in the information community. Z39.19 is the primary source for guidance in the construction, format, and maintenance of this special type of controlled vocabulary.

Background

The first edition of this Standard, published in 1974, was prepared by Subcommittee 25 on Thesaurus Rules and Conventions of American National Standards Committee Z39 on Standardization in the Field of Library Work, Documentation, and Related Publishing Practices (later known as Library and Information Sciences and Related Publishing Practices). The subcommittee drew heavily on standards of practice developed by the Engineers Joint Council, the Committee on Scientific and Technical Information of the Federal Council for Science and Technology, and UNESCO.

When Z39.19 was first conceived, terms selected from a thesaurus were generally applied when indexing various collections of documents. These might be printed resources such as journal articles, technical reports, or newspaper articles. As new information storage and retrieval systems have emerged, the concept of ‘document’ has been extended to include materials such as patents, chemical structures, maps, music, videos, museum artifacts, and many other types of material that are not traditional documents. Furthermore, the display methods described were almost entirely for various sorts of printed products. In today’s online world, other methods of organization and display must be taken into account.

In 1998, the Standard was reviewed and reaffirmed. At this time, however, the review confirmed a need for a fresh look at the Standard, updating it for use in the rapidly evolving electronic information environment. In response, NISO organized a national Workshop on Electronic Thesauri, held November 4–5, 1999, to investigate the desirability and feasibility of developing a standard for electronic thesauri. The workshop was co-sponsored by the American Psychological Association (APA), the American Society of Indexers (ASI), and the Association for Library Collections and Technical Services (ALCTS), a division of the American Library Association. The current project to revise Z39.19 grew out of the recommendations developed by consensus at the Workshop. The report on the Workshop on Electronic Thesauri, November 4–5, 1999 is available on the NISO web site at http://www.niso.org/news/events_workshop/thes99rpt.html.

The Workshop identified a number of limitations of the existing Standard:

• It is difficult for non-lexicographers to understand. Many potential users who expressed interest in the Standard had no background in library science or related fields, and thus found the concepts difficult to apply to their particular applications even though many recognized the need to do so.
• It is focused on construction and maintenance. The Standard assumed knowledge of the underlying principles of information science that promoted the use of controlled vocabularies.
• It is limited to document indexing applications. Although the original context for controlled vocabularies was for indexing and retrieval of documents, in the intervening years it became highly desirable to apply the underlying discipline to many different types of materials including websites.
• It is limited to post-coordinate retrieval. The Standard assumed that the controlled vocabularies within its scope were to be used in post-coordinate retrieval systems. This assumption limited its applicability to other types of retrieval, including browse and navigation systems.
• It is limited to printed products. The display formats for the controlled vocabularies that were recommended included only printed presentations of the controlled vocabularies. Because of the date the work was first conceived (and even at the time of its last revision in 1993), virtually no controlled vocabularies were being used in a web-enabled environment.
• It uses outdated technology. Finally, although the principles presented in the original Standard are still relevant, many of the examples were based on outdated technology. This needed to be updated to make the Standard relevant to contemporary users.

Relying on feedback from the community and extensive internal discussions, NISO launched an initiative to revise Z39.19. The work has been made possible by generous support from the H.W. Wilson Company, the Getty Foundation, and the National Library of Medicine. With the aim of achieving the best possible result, and making sure the major stakeholders were involved, NISO assembled an advisory group to guide the work. The Thesaurus Advisory Group, or TAG as it has become known, consists of members from many segments of the information industry. The members are:

Vivian Bliss, Microsoft
Carol Brent, ProQuest
Dave Clark, Synapse Corporation
Sabine Kuhn, Chemical Abstracts Service
Pat Kuhr, H.W. Wilson Company
Diane McKerlie, DMA Consulting
Peter Morville, Semantic Studios
Stuart Nelson, National Library of Medicine
Diane Vizine-Goetz, OCLC, Inc.
Trish Yancey, Synapse Corporation
Marcia Lei Zeng, Special Libraries Association

Cynthia Hodgson (NISO) and Emily Gallup Fayen (MuseGlobal, Inc. and NISO SDC Liaison) are preparing the revision.

NISO’s goal for the revised Z39.19 Standard

In February 2003 NISO conducted a survey to learn more about how Z39.19 was being used. The survey results showed that the respondents wanted several things from the revision:

• The revised standard should provide a better, more inclusive way to represent content; that is, the standard should be applicable to a broader array of materials than documents.
• The revision must take into account a changing audience as well as a vastly different information environment.
• As the number of information resources that use controlled vocabularies grows, there is increasing need for interoperability and sharing across applications.

The scope of the revision

As a first step, the Advisory Group discussed several ways in which the scope of the standard could be broadened and changed to meet the changing needs of implementers.

• Expand the scope beyond thesaurus to include controlled vocabularies. This change in scope reflected the need to make the Standard applicable to controlled vocabularies other than those that had been used so extensively in the indexing of documents by the various abstracting and indexing (A&I) services. The title of the revision will be Guidelines for the Construction, Format, and Management of Controlled Vocabularies.
• Make the Standard more accessible to users. The original Standard had been developed by lexicographers for lexicographers. It assumed that readers were familiar with the underlying concepts and principles of vocabulary control. This is no longer true for the greatly expanded audience for the Standard, which is resulting in the frequent requests for download from the NISO website.
• Explain important concepts. Because many potential users of this Standard do not have a background in library science or information science, it is important to explain the important concepts so that users will understand the reasons behind some of the rules and guidelines.