Research Article
Total Page:16
File Type:pdf, Size:1020Kb
286 American Archivist / Vol. 52 / Summer 1989 Research Article Downloaded from http://meridian.allenpress.com/american-archivist/article-pdf/52/3/286/2747828/aarc_52_3_g562600um1063123.pdf by guest on 29 September 2021 Authority Control Issues and Prospects DAVID BEARMAN Abstract: Research on authority control reported in archival, library, and information science literature suggests that efforts to control topical subject terminology are inappro- priate and ineffective in an archival setting because researchers are unlikely to use the same terminology as that contained in the documents, and because most users value precision over recall (inclusiveness) in their searching. The author argues that archival retrieval will be enhanced by placing more emphasis on increasing the number of access points and less on achieving consistency in indexing. He describes various kinds of au- thority files and identifies several (occupation, time period, geographic coordinates, form- of-material, and function) that offer the most promise. He advocates the use of existing reference files and cooperative development of new ones, to be used not only in the traditional authority-control sense but also as valuable information resources in their own right. About the author: David Bearman is the editor of Archives and Museum Informatics and Archives and Museum Informatics Technical Reports and is the founder of Archives & Museum Informatics, a consulting firm in Pittsburgh, Pennsylvania. He has written extensively on issues related to archival automation and archival information exchange. Authority Control 287 OVER THE PAST FEW years, authority con- on manual updating, break down and thereby trol has received increased attention in the fail to deliver on their promise.2 archival community.1 Experience in local Given the substantial intellectual, tech- automation applications and involvement nical, and administrative overhead in- in national bibliographic databases has volved in authority control, it is not Downloaded from http://meridian.allenpress.com/american-archivist/article-pdf/52/3/286/2747828/aarc_52_3_g562600um1063123.pdf by guest on 29 September 2021 heightened archivists' awareness of both surprising that researchers have sought to system design and data quality issues. Au- determine whether vocabulary control thority control, as currently practiced in ar- works, how it could be made to work bet- chives and library systems, employs a ter, what alternatives exist, and if it is the controlled vocabulary to restrict what is en- best strategy to improve retrieval. Al- tered in the fields of a cataloging database though the conclusions of the research lit- to values contained in external authority erature are far from uniform, two findings files. have emerged with great regularity. First, The decision to rely on authority files to vocabulary control works best when the impose consistency in the database leads to terminology of the documents and that of a series of demanding requirements. Some- the researchers are highly consistent, that where professionals must create and update is, when the collections and the use of them 3 the authority files. Indexers, catalogers, and are relatively homogeneous. Second, lim- others who assign terms for access points iting access points to authorized terms gen- must check look-up authority files to be erally results in lower recall (fewer finds) sure that their proposed terms are recog- and higher precision (fewer false finds). In nized, either as "preferred" terms or al- any case, authority control is less effective ternative values. Then the indexers may have overall than the introduction of richer lead- to do further research to determine whether in vocabularies, i.e., additional (uncon- or not the persons, places, or things in the trolled) terms that point to the appropriate 4 item or collection being cataloged are iden- controlled terms. tical to those of like name in the authority Taken alone, these conclusions should file. If valid lists are to be maintained, new discourage archivists from any further ef- terms must be confirmed by research in ex- forts to employ authority control for topical ternal sources and any conflicts must be subject access points, because neither of resolved and/or reflected in the database. the necessary situations exists for subject As a consequence of these requirements, access to archival information systems: use many systems that supposedly employ au- thority control, especially those dependent 2Joseph W. Palmer, "Subject Authority Control & Syndetic Structure: Myth & Realities: An Inquiry into 'Much of this literature is discussed, evaluated, and certain subject heading practices and some questions usefully added to by the authors published in Avra about the implications," Cataloging & Classification Michelson, ed., Archives and Authority Control, Ar- Quarterly, vol.7, no.2: 71-95; Catherine M. Thomas, chival Informatics Technical Report, vol.2, no.2 "Authority Control in Manual vs. Online Catalogs: (Summer 1988). Individual contributors are: Jackie An examination of 'See' references," Information Dooley, "An Introduction to Authority Control for Technology and Libraries 3 (December 1984): 393- Archivists," 5-18; Tom Garnett, "Development of an 8. Authority Control System for the Smithsonian Insti- 3Carol Tenopir, "Full Text database retrieval per- tution Libraries," 21-27; Marion Matters, "Authority formance," Online Review 9 (1985): 149-164; Ten- Files in an Archival Setting," 29-33; Avra Michel- opir, "Searching by Controlled Vocabulary or Free son, "Descriptive Standards and the Archival Profes- Text," Library Journal 112 (15 November 1987): 58- sion," 1-4; Richard Szary, "Technical Requirements 59. and prospectus for authority control in the SIBIS— "Katherine W. McCain, Howard D. White, and Archives Database," 41-44; Lisa Weber, "Develop- Belver C. Griffith, "Comparing retrieval performance ment of Authority Control Systems Within the Ar- in online databases," Information Processing & Man- chival Profession," 35-40. agement 23 (1987): 539-553. 288 American Archivist / Summer 1989 of subject terminology, both within archi- vocabulary searches revealed differences in val collections and by researchers, is no- retrieval described only as differences, and toriously heterogeneous; and the archival recommended searching both controlled and literature has been fairly consistent in ar- uncontrolled terminology. The most influ- guing that archival users value recall over ential of these studies reported that full text Downloaded from http://meridian.allenpress.com/american-archivist/article-pdf/52/3/286/2747828/aarc_52_3_g562600um1063123.pdf by guest on 29 September 2021 precision.5 This hypothesized failure of searches provide better recall but poorer topical subject-based authority control has precision.8 been empirically demonstrated by Avra The most comprehensive review of sub- Michelson.6 It would follow, then, that ar- ject authorities in library systems over the chivists should stop wasting their time on past twenty years, conducted by a partisan the effort to control topical subject termi- of vocabulary control, Elaine Svenonius, nology and instead should look for findings recently concluded that subject authority that can lead to more strategic approaches control isn't working.9 Svenonius assigns to vocabulary control. the blame for the failure of authority con- It is not necessary to dismiss all types of trol to a combination of what she calls "in- authority control for all archival purposes; trinsic variables," such as the quality of the research literature suggests that certain the vocabulary and the nature of the dis- kinds of authority control, if correctly im- cipline that it represents, and "external plemented, can benefit specific types of uses. variables," such as the skills of indexers Rather than emphasize the headings-man- and searchers and the criteria for retrieval agement aspects of name-authority control, evaluation. She particularly finds that the archivists should focus substantially greater problem lies in lack of distinction between efforts on building cooperative reference files types of controlled vocabularies and con- for occupation, function, geographic-co- cludes that "it would seem that the vocab- ordinate, time-period, and form-of-mate- ulary of a discipline should precede any rial terms; these will provide access to attempt to control that vocabulary for in- people, organizations, places, events, and formation retrieval."10 Her conclusions are records, respectively. not unlike the findings of Bhattacharyya, and the earlier Cranfield II experiments, When Does Authority Control Work? which demonstrated that the greater the ter- minological consistency in the field, the The first Cranfield experiments and sim- greater the benefit of vocabulary control.11 ilar large retrieval experiments of the early 1960s demonstrated that simple controlled Subject based authority control in ar- vocabularies with normalized word endings chives also fails in applications because (dropping suffixes such as ing and ed) and synonymy performed better than full vo- 7 "Karen Markey, Pauline Athcrton, and Claudia cabulary control. Subsequent studies com- Newton, "An analysis of controlled vocabulary and paring natural language and controlled free text search statements in online searches," On- line Review 4 (1980): 225-236. 9Elaine Svenonius, "Unanswered Questions in the Design of Controlled Vocabularies," Journal of the 5Mary Jo Pugh, "The Illusion of Omniscience: