
Knowl. Org. 46(2019)No.4

KO KNOWLEDGE ORGANIZATION

Official Journal of the International Society for Knowledge Organization ISSN 0943 – 7444 International Journal devoted to Concept Theory, Classification, Indexing and Knowledge Representation

Contents

Articles

Sandra Johansson and Koraljka Golub. LibraryThing for Libraries: How Tag Moderation and Size Limitations Affect Tag Clouds ...... 245
Emma Quinlan and Pauline Rafferty. Astronomy Classification: Towards a Faceted Classification Scheme ...... 260

Reviews of Concepts in Knowledge Organization

Howard D. White. Patrick Wilson ...... 279
Richard P. Smiraglia. Work ...... 308
Jarmo Saarti. Fictional Literature, Classification and Indexing ...... 320

Books Recently Published ...... 333


KNOWLEDGE ORGANIZATION

This journal is the organ of the INTERNATIONAL SOCIETY FOR KNOWLEDGE ORGANIZATION (General Secretariat: Amos DAVID, Université de Lorraine, 3 place Godefroy de Bouillon, BP 3397, 54015 Nancy Cedex).

Editors

Richard P. SMIRAGLIA (Editor-in-Chief), Institute for Knowledge Organization and Structure, Shorewood WI 53211 USA.
Joshua HENRY, Institute for Knowledge Organization and Structure, Shorewood WI 53211 USA.
Peter TURNER, Institute for Knowledge Organization and Culture, Shorewood WI 53211 USA.
J. Bradford YOUNG (Bibliographic Consultant), Institute for Knowledge Organization and Structure, Shorewood WI 53211, USA.

Editor Emerita

Hope A. OLSON, School of Information Studies, University of Wisconsin-Milwaukee, Milwaukee, Northwest Quad Building B, 2025 E Newport St., Milwaukee, WI 53211 USA.

Series Editors

Birger HJØRLAND (Reviews of Concepts in Knowledge Organization), Department of Information Studies, University of Copenhagen.
María J. LÓPEZ-HUERTAS (Research Trajectories in Knowledge Organization), Universidad de Granada, Facultad de Biblioteconomía y Documentación, Campus Universitario de Cartuja, Biblioteca del Colegio Máximo de Cartuja, 18071 Granada.

Editorial Board

Thomas DOUSA, The University of Chicago Libraries, 1100 E 57th St, Chicago, IL 60637 USA.
Melodie J. FOX, Institute for Knowledge Organization and Structure, Shorewood WI 53211 USA.
Jonathan FURNER, Graduate School of Education & Information Studies, University of California, Los Angeles, 300 Young Dr. N, Mailbox 951520, Los Angeles, CA 90095-1520, USA.
Claudio GNOLI, University of Pavia, Science and Technology Library, via Ferrata 1, I-27100 Pavia, Italy.
Ann M. GRAF, School of Library and Information Science, Simmons University, 300 The Fenway, Boston, MA 02115 USA.
Jane GREENBERG, College of Computing & Informatics, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104 USA.
José Augusto Chaves GUIMARÃES, Departamento de Ciência da Informação, Universidade Estadual Paulista–UNESP, Av. Hygino Muzzi Filho 737, 17525-900 Marília SP Brazil.
Michael KLEINEBERG, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin.
Kathryn LA BARRE, School of Information Sciences, University of Illinois at Urbana-Champaign, 501 E. Daniel Street, MC-493, Champaign, IL 61820-6211 USA.
Devika P. MADALLI, Documentation Research and Training Centre (DRTC), Indian Statistical Institute (ISI), Bangalore 560 059, India.
Daniel MARTÍNEZ-ÁVILA, Departamento de Ciência da Informação, Universidade Estadual Paulista–UNESP, Av. Hygino Muzzi Filho 737, 17525-900 Marília SP Brazil.
Widad MUSTAFA el HADI, Université Charles de Gaulle Lille 3, URF IDIST, Domaine du Pont de Bois, Villeneuve d’Ascq 59653, France.
H. Peter OHLY, Prinzenstr. 179, D-53175 Bonn, Germany.
M. Cristina PATTUELLI, School of Information, Pratt Institute, 144 W. 14th Street, New York, New York 10011, USA.
K. S. RAGHAVAN, Member-Secretary, Sarada Ranganathan Endowment for Library Science, PES Institute of Technology, 100 Feet Ring Road, BSK 3rd Stage, Bangalore 560085, India.
Heather Moulaison SANDY, The iSchool at the University of Missouri, 303 Townsend Hall, Columbia, MO 65211, USA.
M. P. SATIJA, Guru Nanak Dev University, School of Library and Information Science, Amritsar-143 005, India.
Aida SLAVIC, UDC Consortium, PO Box 90407, 2509 LK The Hague, The Netherlands.
Renato R. SOUZA, Applied Mathematics School, Getulio Vargas Foundation, Praia de Botafogo, 190, 3o andar, Rio de Janeiro, RJ, 22250-900, Brazil.
Rick SZOSTAK, University of Alberta, Department of Economics, 4 Edmonton, Alberta, T6G 2H4.
Joseph T. TENNIS, The Information School of the University of Washington, Box 352840, Mary Gates Hall Ste 370, Seattle WA 98195-2840 USA.
Maja ŽUMER, Faculty of Arts, University of Ljubljana, Askerceva 2, Ljubljana 1000 Slovenia.


LibraryThing for Libraries: How Tag Moderation and Size Limitations Affect Tag Clouds

Sandra Johansson* and Koraljka Golub**

*Boarpsvägen 310, 266 97 Hjärnarp, Sweden
**Linnaeus University, School of Cultural Sciences, Department of Library and Information Science, Faculty of Arts and Humanities, 351 95 Växjö, Sweden

Sandra Johansson earned her bachelor’s degree in library and information science at Linnaeus University, Sweden in January, 2018 by conducting the study presented in this article. She is especially interested in ways to develop and adapt the library as our society and needs change, and is currently working as a librarian with special focus on young adults at a local library in a small town in Sweden.

Koraljka Golub is an associate professor at Linnaeus University, Sweden. Her research primarily focuses on topics related to information retrieval and knowledge organization. She is particularly interested in integrating traditional knowledge organization systems with social tagging and/or automated subject indexing, and in evaluating the results in the context of end-user information retrieval. Details of her research projects and related activities are available at her website.

Johansson, Sandra and Koraljka Golub. 2019. “LibraryThing for Libraries: How Tag Moderation and Size Limitations Affect Tag Clouds.” Knowledge Organization 46(4): 245-259. 33 references. DOI:10.5771/0943-7444-2019-4-245.

Abstract: The aim of this study is to analyse differences between tags on LibraryThing’s web page and tag clouds in their “LibraryThing for Libraries” service, and assess if, and how, the LibraryThing tag moderation and limitations to the size of the tag cloud in the library catalogue affect the description of the information resource. An e-mail survey was conducted with personnel at LibraryThing, and the results were compared against tags for twenty different books, collected from two different library catalogues with disparate tag cloud sizes, and LibraryThing’s web page. The data were analysed using a modified version of Golder and Huberman’s tag categories (2006). The results show that while LibraryThing claims to only remove the inherently personal tags, several other types of tags are found to have been discarded as well. Occasionally a certain type of tag is included in one book, and excluded in another. The comparison between the two tag cloud sizes suggests that the larger tag clouds provide a more pronounced picture regarding the contents of the book but at the cost of an increase in the number of tags with synonymous or redundant information.

Received: 20 January 2019; Revised: 4 May 2019; Accepted: 10 May 2019

Keywords: tags, library catalogues, LibraryThing, LTFL, tag clouds

1.0 Introduction

End-user tagging is a popular service of many online information systems, providing users with opportunities for personal and collaborative interactive information organization and retrieval. While advantages such as additional access points representing users’ perspectives have been identified in the literature, at the same time absence of policies may prevent successful retrieval. Moreover, the users need to be willing to contribute to the system, a characteristic which has been shown to be lacking in many library catalogues with tagging features.

Importing tags from an external, well-established source such as LibraryThing (https://www.librarything.com), presents a strong candidate for enhancing library catalogues by social tags. This is particularly pertinent to tags for literary fiction for which commonly used subject indexing languages in libraries often do not suffice. LibraryThing offers a library service, “LibraryThing for Libraries” (hereinafter shortened LTFL), which allows data from LibraryThing, such as tags, ratings and comments, to be imported into library catalogues. Tags in LibraryThing undergo a manual control before they are incorporated into LTFL and impose a limitation regarding the size of tag clouds in library catalogues.

In order to better understand the advantages and disadvantages of importing tags from existing tagging services to library catalogues, this study aims to examine how LibraryThing’s tag moderation process of tags in LTFL works, to analyse the differences between the tag clouds on LibraryThing’s website and tag clouds in LTFL, and to investigate the impact that LibraryThing’s predetermined options for different tag cloud sizes have on library catalogue records. The sample includes data collected from two public library catalogues using LTFL: South Central Library System in Wisconsin, USA (http://www.scls.info/), and Spokane County Library District (https://scld.ent.sirsi.net/) in Washington, USA. Tags assigned to twenty literary fiction books at both libraries were collected from the two library catalogues and from the LibraryThing web site and then compared against one another using a modified version of seven tag categories identified by Golder and Huberman (2006).

The remainder of the paper is structured as follows: Section 2 (Previous research) discusses (dis)advantages of end-user tagging and folksonomies, especially in relation to literary fiction; Section 3 (Methodology) describes the sample and methodology used to conduct the study, including the modified version of Golder and Huberman’s categories (2006); Section 4 (Results and analysis) presents and analyses the collected data; and, Section 5 (Conclusion) provides some final thoughts and outlines suggestions for future research.

2.0 Previous research

Research on social tagging and folksonomies (sets of tags resulting from social tagging) started when pioneering services like Delicious (https://del.icio.us) and Flickr (https://flickr.com) emerged, with the majority of studies published from 2006 onwards (Furner 2010). The discussion on (dis)advantages of social tagging is centred on two major foci. Firstly, unlike professional indexing systems, there are no restrictions or rules on how tags should be designed or applied: different users use different words for the same concept, homonyms are not disambiguated, hierarchical and other relationships between tags are often absent, tags may be written in different forms (singular/plural, spelling variations etc.), they may be unlimited in quantity or may have relevance for personal use only (e.g., “to read”) (Furner 2010; Gerolimos 2013; Golder and Huberman 2006; Guy and Tonkin 2006; Kipp et al. 2015; Rolla 2009; Steele 2009). At the same time, they are characterized by the natural everyday language that the users are familiar with and can relate to, especially if compared to more formal and traditional subject indexing languages, which may not always reflect current terms and may contain outdated terms (Adler 2009; Bates and Rowley 2011; Furner 2010; Spiteri 2006). Furthermore, it is the great variation of perspectives represented that makes a folksonomy potentially useful to all people, regardless of subject knowledge and social and cultural backgrounds (Spiteri 2006; Steele 2009). In her recent review of tagging literature, Rafferty (2018, 510) concludes that although tags, being largely dependent on the taggers, may underperform in comparison to established subject indexing systems, they still “complement, enrich, and … enhance conventional retrieval systems.”

Over time, as the number of tags from many different users increases, stable patterns and a common frame of reference emerge (Fox 2012; Lin et al. 2006). Users communicate with, and learn from each other, and a general consensus appears regarding which terms match an information resource best (Golder and Huberman 2006). The more users who assign the same tag to an information resource, the greater the likelihood that the tag is relevant; at that point, unique and personal tags become less visible. Still, minority and divergent opinions can coexist with the majority if users are given the possibility to switch between viewing the most popular tags and the full tag collection (Golder and Huberman 2006; Spiteri 2006; Steele 2009).

In order to address the challenges, different approaches have been proposed in the literature. Guy and Tonkin (2006) recommend an introduction of rules and guidelines for tagging, as well as automatic tag suggestions when creating new tags. They also warn that excessive control and regulation of tags could harm the strengths of folksonomies. Several researchers suggest that tags and controlled vocabularies may complement each other well (Adler 2009; Anfinnsen et al. 2011; Fox and Reece 2013; Golub 2016; Kakali 2014; Kipp 2011; Rolla 2009; Spiteri and Pecoskie 2016; Steele 2009). Golub et al. (2014) propose that one way to accomplish this would be to provide users with automatic suggestions of terms from an established vocabulary when they are about to create new tags. Based on a user study of a prototype system, this proved to help produce ideas of which tags to use, to make it easier to find focus for the tagging, to ensure consistency and to increase the number of access points in retrieval. However, the value and usefulness of the suggestions proved to be dependent on the quality of the suggestions, both as to conceptual relevance to the user and as to appropriateness of the terminology.

Related to the findings that automatic suggestions help find focus for tagging and increase the number of access points for retrieval are Munk and Mørk’s (2007a, 2007b) studies of the social bookmarking service del.icio.us, in which they found that broad tags are most common while more specific ones are rare. They describe (2007b, 16) this as a bias which derives from “a cognitive economizing through a simplification principle in the users’ construction of descriptive metadata.”

When it comes to incorporating social tags into library catalogues, Kakali’s (2014) survey of professional cataloguers shows that they have a positive attitude towards using tags as a complement to traditional subject indexing in the library catalogue. Wu, Xu and Yu (2016) claim that libraries have a unique role to play since they provide both the metadata and the actual literature, something social media are unlikely to overcome. In addition, social tagging services in library catalogues hold the potential to help strengthen the relationship and communication between libraries and their users.

As to the value of tags in relation to established subject indexing in libraries, Rolla (2009) compared tags from LibraryThing with subject headings from the Library of Congress for forty-five books in literary fiction. The results showed that the tags were significantly more numerous than the subject headings; an average of over forty tags per book compared to fewer than four subject headings. The tags were broader and more general than the subject headings and contained several new or current concepts, while the subject headings were superior when it came to specific historical periods, something Rafferty also found in her study of image tagging (2011). Rolla’s conclusion was that while tags cannot replace a controlled vocabulary, they do improve access in the library catalogue, the end-user and professional indexing complementing each other. The value of both controlled vocabularies and tags for retrieval was concurred with by Kipp and Campbell (2010) and Golub et al. (2014) who showed that a number of additional access points in retrieval are provided by tags compared to traditionally designed search systems.

LTFL has been studied previously as well. Westcott, Chappell and Lebel (2009) studied use of LTFL at the Claremont University Library and inferred that their experiences with LTFL were mostly positive and of particular benefit for foreign-language publications as well as literary fiction on certain themes. One of the main drawbacks of LTFL for them was that the tags are not searchable via the OPAC search fields, since LTFL operates as an overlay; they are only searchable through the tag browser, which is accessed via library records containing tags, something that Pirmann (2012) concurs with in her usability study of LTFL in another online library catalogue. Voorbij (2012) analysed a large sample of catalogue records and LTFL tags in a Dutch academic library and found that about one third of the records were provided with tags from LTFL. Of those, about half of the tags were already covered by a keyword in the library record, one quarter were broader than a keyword, and another quarter of tags were related, narrower, or new. His estimation was that almost 40% of the library records that contained tags could be considered enriched by LTFL. Both Pirmann and Voorbij concluded that while tags cannot replace traditional subject headings, they do enrich the library catalogue and are useful when searching on topics that the end-user is less familiar with, when gathering ideas for additional keywords, and when exploring related subjects.

3.0 Methodology

3.1 Purpose and aims

Since a relatively large amount of tags is needed in order to create a stable pattern and for a general consensus to emerge regarding the description of an information resource, rather than providing its own tagging service within the library catalogue, it may be advantageous for a library to import tags from an external source, such as LibraryThing. In order to contribute to a better understanding of the implications of tags on retrieval in library catalogues which import tags from existing tagging services, we aim to determine ways in which the selection process of tags in LTFL affects the resulting tag clouds in library catalogues, and whether LibraryThing’s predetermined options for tag-cloud sizes result in loss of valuable tags. Specifically, the following three research questions are posed:

1. Why does LibraryThing moderate tags before approving them for LTFL, and how is the selection process of tags in LibraryThing carried out?
2. What are the results of the tag moderation process (tags need to be manually approved in order to be included in LTFL), i.e., what differences can be found between LibraryThing tags in a library catalogue and on LibraryThing website?
3. What consequences do the predetermined options for tag-cloud size limitation in library catalogue records, directed by the receiving library, have on the description of information resources in the library catalogue?

3.2 Data collection

3.2.1 Selection of information services

The well-known online social cataloguing and networking site called LibraryThing was chosen as the source of data for the study. LibraryThing contains bibliographic data collected from libraries and bookstores all over the world. In addition, end users may create an account and add books of their preference, find similar books, participate in discussions, as well as rate, tag and review the books. At the time of writing (October 2018), LibraryThing held metadata for over 129 million books and 146 million tags, with over 2.3 million registered members worldwide (LibraryThing 2018).

Via LibWeb’s list of American public libraries (http://www.lib-web.org/united-states/public-libraries/), and LibraryThing’s “Your Local” service (http://www.librarything.com/local), two libraries were selected by convenience; the first two public libraries found that a) used LibraryThing tags in their catalogue records, and b) do not have the same size limit of the tag clouds for individual records in their library catalogues. Since the LTFL tags are external, all libraries receive the same tags for a specific book; what may differ is the number of tags in the tag clouds, something each library can choose for themselves. The two selected libraries were Spokane County Library District (https://scld.ent.sirsi.net/) in Washington, USA, with a tag cloud limit of twenty-five tags in each library record, and South Central Library System in Wisconsin, USA (http://www.scls.info/), with a tag cloud limit of fifteen tags per library record.

3.2.2 Book selection

The focus of the study being selection of tags and size of tag clouds, books were chosen that contained the maximum number of tags in their tag clouds in both library catalogues. Since Spokane County Library District has a bigger tag cloud limit than South Central Library System, the former catalogue became the starting point, and the first forty books of literary fiction that were encountered in the catalogues written by different authors and with full tag clouds imported from LibraryThing, were documented. These forty books were then looked up in the South Central Library System, and from the twenty-eight books found in both, a final selection of twenty fictional books was made in the process described below.

Having a heterogeneous sample as a target in order to identify a range of examples, the twenty books were chosen to represent different author nationalities, genres, years of publication, and sizes of the total tag collection on LibraryThing’s website. These included books of the following different characteristics:

– Author nationality: American, Canadian, Danish, Dominican-American, English, Finnish-Swedish, Japanese-British, Norwegian, Swedish, and Turkish;
– Genre: ten children and young adults’ books, and ten adults’ books;
– Original year of publication: from Austen in 1813 to Ishiguro and Pamuk in 2015;
– Tag size: The number of different tags assigned to the books has a wide range; from Thor and Ingelman-Sundberg with just over 150 unique tags, to Austen and Rowling with about 5,000 unique tags each. Tag clouds for each of the twenty books with fifteen and twenty-five tags in the library catalogues were compared to their 100 most popular tags on LibraryThing.

Table 1 displays all the adult books included in the study, including author, title, author’s nationality, original publication year of the book, and the approximate amount of tags on LibraryThing’s website. Note that the exact number of tags for a book is not presented by LibraryThing, which is why the approximation is based on a symbol search of “)” across the collected data in Microsoft Word for each of the books in the study.

| Author | Title | Author nationality | Publication year | Number of tags (circa) |
|---|---|---|---|---|
| Douglas Adams | The ultimate hitchhiker’s guide to the galaxy | English | 1994 | 650 |
| Julia Alvarez | How the García girls lost their accents | Dominican-American | 1991 | 750 |
| Margaret Atwood | The handmaid’s tale | Canadian | 1985 | 4300 |
| Jane Austen | Pride and prejudice | English | 1813 | 5000 |
| Dan Brown | The lost symbol | American | 2009 | 2000 |
| Catharina Ingelman-Sundberg | The little old lady who broke all the rules | Swedish | 2012 | 170 |
| Kazuo Ishiguro | The buried giant | Japanese-English | 2015 | 800 |
| Jo Nesbø | The bat | Norwegian | 1997 | 700 |
| Orhan Pamuk | A strangeness in my mind | Turkish | 2014 | 180 |
| J. R. R. Tolkien | The fellowship of the ring | English | 1954 | 4700 |

Table 1. Adult literature included in the study, and approximate amount of tags on the LibraryThing website.

Table 2 displays all the children and young adults’ books included in the study, including author, title, author nationality, original year of publication, and the approximate number of tags on LibraryThing’s website.

| Author | Title | Author nationality | Publication year | Number of tags (circa) |
|---|---|---|---|---|
| Frances Hodgson Burnett | A little princess | English | 1905 | 2400 |
| Orson Scott Card | Ender’s game | American | 1985 | 4200 |
| Roald Dahl | Charlie and the chocolate factory | English | 1964 | 3200 |
| John Green | The fault in our stars | American | 2012 | 2700 |
| Tove Jansson | Tales from Moominvalley | Finnish-Swedish | 1962 | 480 |
| Lene Kaaberbøl | The Shamer’s daughter | Danish | 2002 | 170 |
| Astrid Lindgren | Ronia, the robber’s daughter | Swedish | 1981 | 760 |
| Mary Pope Osborne | Dinosaurs before dark | American | 1992 | 1400 |
| J. K. Rowling | Harry Potter and the sorcerer’s stone | English | 1997 | 5000 |
| Annika Thor | The lily pond | Swedish | 1997 | 160 |

Table 2. Children and young adults’ literature included in the study and approximate amount of tags on the LibraryThing website.

3.2.3 Tags

In the following step, the book tags from both library catalogues and from LibraryThing were documented. The total of 100 most popular tags on LibraryThing per book were selected for analysis; LibraryThing ranking is based on the number of different users who add the same tag for the book. Figure 1 shows an example of tag data on LibraryThing’s website.

Figure 1. Tag data for the tag “regency.” This tag has been used for the book “Pride and prejudice” by Jane Austen, which is included in this study.

In cases when some tags have been used the same number of times, we have strived to follow the same principle as LTFL and rank them by their overall popularity on LibraryThing (see Section 4.1.3 for more information on ranking in LTFL). Following the assumption that a specific tag used by many different users should theoretically be more useful for other people than a tag used many times but by only one user, the tags in our data collection were ranked first by the number of members who have added the tag, and second by the number of times the tag has been added. The data were downloaded from the website and library catalogues on the same day, 5 November 2017. Since the tag clouds in the library catalogues delivered by LTFL are presenting the most popular tags on LibraryThing for that book (LibraryThing for Libraries 2017), the 100 most popular tags for a book on LibraryThing should theoretically function as a sort of blueprint for the tag clouds in LTFL. This “blueprint” has been used in our study to analyse any deviations and identify tags that have not been included in LTFL.

3.3 Method

The main method used was tag analysis, explained in more detail below. In addition, in order to gain an understanding of the policies and practices related to LTFL’s tag moderation, an email survey questionnaire with relevant open-ended questions was sent to the person in charge of LTFL at LibraryThing. The questions concerned the tag moderation process, ways of combining tags to solve problems with synonyms and the like, size limits of the tag clouds, and selection priorities of tags for the tag clouds in LTFL. The responses were received on 4 December 2017.

The tags were compared and analysed based on their content using tag categories that Golder and Huberman identified in the Delicious bookmarking service (2006), which can serve the following functions: 1) identifying what (or who) the object is about; 2) identifying what it is (e.g., book, article, blog); 3) identifying who owns it; 4) refining categories (tags that provide additional information supporting other tags); 5) identifying qualities or characteristics of the object; 6) self-reference (personal tags like mystuff); and, 7) task organizing (for example “toread” or “jobsearch”).

Since Delicious is different from LibraryThing because it aims at organizing websites rather than books, some modifications to the categories were necessary for our purposes. While the first category of tags, identifying what (or who) the object is about, works just as well on books as it does on bookmarks, the following two categories, describing what the object is and who owns it, were merged into a single category containing tags that relate to the characteristics of the artefact, such as edition, publisher, and media form, since ownership of the book is irrelevant in a library setting. Category four, refining categories, was removed because no tags of this kind were found, while category five, identifying qualities or characteristics of the object, was left unchanged. Categories six and seven, self-reference and task organizing, were merged into one category of users’ personal tags, since no real difference between the two was found on LibraryThing. In addition, two more categories were added; one for bibliographic data, containing information independent of the literary artefact such as author, title and publication year, and one for foreign languages, unknown abbreviations, codes and the like, as Thomas, Caudle and Schmitz did in their study (2009). The resulting six categories of tags used in this study are as follows:

1. Plot: Identifies who (characters, places, groups) or what (time series, concepts, phenomena, events) the literary work is about, for example, “wizards,” “regency era,” or “Frodo Baggins.”
2. Artefact: Identifies manifestations of the particular item that the user has read, for example, edition, owner or publisher of the artefact, such as “e-book,” “penguin classics,” “signed,” “library,” “first edition” or “pocket.” Tags in this category focus solely on information regarding the particular media or form of the literary artefact; this is information that can vary greatly between different readers even if they are reading the same literary work.
3. Characteristics: genre, opinions or other characteristics, such as “fantasy,” “classic,” “favourite,” “Nobel prize” or “inspiring.” The tags in this category focus on the content of the book but are expressed from the user’s personal views, perspectives and context, making them both personal and bibliographical at the same time.
4. Personal: user’s personal tags, such as “goodreads” or “to be read.” This category contains inherently personal tags that would be completely useless to anyone but the one creating them, while opinions such as “fun” have been filed under category three, characteristics, since these could be argued to be of value to others as well (see Section 4.2.3).
5. Bibliographic: bibliographic data, such as year of first publication, author, title, series, target audience and original language. These tags refer to the literary work, regardless of form, edition and other aspects of the artefact, and mostly contain information that could be found in a traditional library record.
6. Unknown: tags written in languages other than Swedish or English (i.e., unknown to the authors), unknown abbreviations, codes and the like. This is a limitation to the study, but the majority of the tags on LibraryThing are in English and only a few percent of the tags in the study belong to this category.

The delimitations between the categories can be quite fluid in some cases, so a certain measure of subjective assessment has been necessary. Such are, for example, tags referring to language or nationality; e.g., “English” could either mean that the user read the book in English, or that the book was written by an English author. As a third party, it is virtually impossible to determine exactly what purpose the user had with the tag, so when the author’s nationality was determined as English, the tag was assigned to category five, bibliographic; in all other cases, it was assigned to category two, artefact.

The following step was to code each tag with one of the appropriate categories from the list above. The tags were always examined in relation to the book they belonged to, and because of this a tag present in more than one tag collection could be assigned different categories depending on the context. Then, the tag categories were analysed and compared as to how they differ between the two library catalogues and LibraryThing, studying the changes across different books and characteristics (such as author nationalities, genres, years of publication, and tag collection sizes), also taking into account the resulting qualitative impact on the description of the information resource in the library catalogue.

4.0 Results and analysis

4.1 Email survey

This section presents the replies received via the email survey related to the policies and practices for tag moderation at LTFL.

4.1.1 The tag moderation process

The tag moderation process is conducted by personnel at LibraryThing, and the goal of the tag moderation process is to remove all personal tags from the tag clouds in the library catalogue. The moderation process involves several considerations. If the tag does not exist in the tag cloud in LTFL already, it is reviewed for inclusion. Only inherently personal tags are excluded, like “left it at mom’s house.” The receiving library may choose to use a filter to clear out potentially inappropriate terms; however, most libraries choose not to modify the tag clouds. Since the tag moderation is done manually it usually takes a while (no further details were given in the survey) before the tag is approved. On the other hand, if the proposed tag has been added previously, the system automatically updates the popularity rating. The assumption behind this decision is that the larger the number of users who add the same tag to the book, the larger the chance that the tag is relevant. However, when importing tags into a library catalogue, a new context arises in terms of end users and the catalogue structure, which potentially implies that tags considered valuable in LibraryThing might be useless in the library catalogue, for example “literary fiction” and “tales,” since

LTFL, the system will select the tag from the tag cloud that is also most widely used elsewhere on LibraryThing. In this study we have strived to apply the same principle when ranking the tags for each of the books by using tag data available on LibraryThing’s webpage (see 3.2.3).

4.2 Tag analysis

This section presents the distribution of tags for adult books, and children and young adults’ books respectively, divided into the categories presented in Section 3.3, both for the 100 most popular tags per book and for the two different tag cloud sizes of twenty-five and fifteen tags.
this information is already provided by the library cata- Following that is an analysis, category by category, of the logue. differences found between the LFTL-tags in the library catalogue and on LibraryThing’s website. Lastly, the con- 4.1.2 Combining tags in LibraryThing sequences of the different size limitations of the tag clouds are explored and evaluated. LibraryThing supports linking tags based on relationships of synonymy. Tag combining can be suggested by any user 4.2.1 Distribution of tags for adult books on LibraryThing, and anyone can participate in the voting process for new combinations, but for a new combination Table 3 below shows distribution of the 100 most popular to pass it must receive four times as many positive votes as tags (collected from LibraryThing’s website) per book for negative ones, win by at least eight votes and the voting the adult books, divided into the different categories pre- must have been open for at least a week. Tag combining sented in Section 3.3, as well as the distribution of LTFL refers to merging synonymous terms under one main tag tags in these categories in the different tag cloud sizes of on a global level, such as, for example, when viewing tags twenty-five (collected from Spokane County Library Dis- assigned by all users on LibraryThing for a certain book. trict) and fifteen tags (collected from South Central Library The tag “WWII,” e.g., has more than 800 different aliases, System) respectively. like “ww2” and “second world war,” and including all these The most popular tags on LibraryThing in books for tags under one main keyword greatly improves the experi- adults are those found in category one, plot (26.0% in av- ence of viewing tag clouds and using tags on a global level erage), followed by category three, characteristics, with on LibraryThing. The process does not affect the user’s 23.0% in average, and then category five, bibliographic personal tags. 
However, based on the data collected in this (21.3% in average). The distribution of tags for adult study, several examples of popular tags which are not books in LTFL is similar. Categories two, four and six, linked exist, such as “children’s literature,” and “children’s largely contain tags that LibraryThing strives to remove books” that appear in parallel in a tag cloud of a book. from LTFL, and only one tag from category two, artefact, Similarly, “fantasy” and “fantasy fiction” tags occur to- and none from category four, personal, and category six, gether. unknown, can be found in the library catalogues. While it is relatively easy to name differences between 4.1.3 Selection priorities of tags for LTFL the 100 most popular tags on LibraryThing and the tag clouds in LTFL in general, it is much harder to describe Each library can customize the display of tags, including how the tag clouds with twenty-five and fifteen tags differ how many to show. The tag clouds may contain, five, ten, from each other. The distribution of tags across the cate- fifteen, twenty, twenty-five or thirty of the most popular gories seem to be relatively consistent when scaling down tags in a visual display. When selecting tags for LTFL, Li- from twenty-five to fifteen tags, with minor differences: braryThing first ranks the tags based on the number of while the twenty-five-tag clouds contained some tags from different users assigning a tag to a specific book, and sec- category two, four and six, none are present in the tag ondly, tags from the tag cloud that are also most popular clouds with fifteen tags. However, the description of the overall in LibraryThing are chosen. An example illustrates information resource seems to be less defined with smaller this: a book has been tagged four times with “friendship,” tag clouds. A deeper analysis of this can be found in Sec- four times with “love” and four times with “magic.” If tion 4.2.4. 
only one more tag is required to complete the tag cloud in 252 Knowl. Org. 46(2019)No.4 S. Johansson and K Golub. LibraryThing for Libraries: How Tag Moderation and Size Limitations Affect Tag Clouds
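The voting rule reported in Section 4.1.2 and the selection priority reported in Section 4.1.3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not LibraryThing's actual implementation: the names `TagStats`, `combination_passes` and `select_cloud` are hypothetical, "four times as many" is read as "at least four times as many", "a week" is read as seven days, and all usage figures below are invented.

```python
from typing import NamedTuple


class TagStats(NamedTuple):
    tag: str
    book_users: int   # distinct users who assigned the tag to this book
    global_uses: int  # uses of the tag across the whole site (invented figures)


def combination_passes(yes_votes: int, no_votes: int, days_open: int) -> bool:
    """Check the three conditions for a tag combination to pass (assumed reading)."""
    return (
        yes_votes >= 4 * no_votes       # four times as many positive votes
        and yes_votes - no_votes >= 8   # win by at least eight votes
        and days_open >= 7              # voting open for at least a week
    )


def select_cloud(candidates: list[TagStats], size: int) -> list[str]:
    """Rank by per-book users first, then break ties by site-wide popularity."""
    ranked = sorted(
        candidates,
        key=lambda t: (-t.book_users, -t.global_uses, t.tag),
    )
    return [t.tag for t in ranked[:size]]


# The tie from the text: three tags used four times each, one free slot.
candidates = [
    TagStats("fantasy", 9, 500_000),
    TagStats("friendship", 4, 40_000),
    TagStats("love", 4, 90_000),
    TagStats("magic", 4, 70_000),
]
cloud = select_cloud(candidates, size=2)
print(cloud)
```

With these invented figures, the free slot goes to “love” because it is the most widely used of the three tied tags; different site-wide counts would change the outcome.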

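CATEGORIES: 1 Plot, 2 Artefact, 3 Characteristics, 4 Personal, 5 Bibliographic, 6 Unknown

| Book | 1 Plot /100 | /25 | /15 | 2 Artefact /100 | /25 | /15 | 3 Characteristics /100 | /25 | /15 | 4 Personal /100 | /25 | /15 | 5 Bibliographic /100 | /25 | /15 | 6 Unknown /100 | /25 | /15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Douglas Adams – The ultimate hitchhiker’s guide to the galaxy | 21 | 5 | 3 | 10 | - | - | 32 | 10 | 8 | 13 | - | - | 24 | 10 | 4 | - | - | - |
| Julia Alvarez – How the García girls lost their accents | 34 | 15 | 11 | 9 | - | - | 15 | 2 | 1 | 15 | - | - | 27 | 8 | 3 | - | - | - |
| Margaret Atwood – The handmaid’s tale | 26 | 8 | 4 | 10 | - | - | 32 | 10 | 7 | 13 | - | - | 19 | 7 | 4 | - | - | - |
| Jane Austen – Pride and prejudice | 25 | 9 | 3 | 10 | - | - | 27 | 7 | 4 | 13 | - | - | 25 | 9 | 8 | - | - | - |
| Dan Brown – The lost symbol | 34 | 16 | 10 | 12 | - | - | 21 | 8 | 4 | 16 | - | - | 16 | 1 | 1 | 1 | - | - |
| Catharina Ingelman-Sundberg – The little old lady who broke all the rules | 20 | 10 | 8 | 10 | 1 | - | 21 | 10 | 5 | 24 | - | - | 17 | 4 | 2 | 8 | - | - |
| Kazuo Ishiguro – The buried giant | 33 | 13 | 6 | 11 | - | - | 22 | 6 | 3 | 14 | - | - | 19 | 6 | 6 | 1 | - | - |
| Jo Nesbø – The bat | 20 | 9 | 7 | 14 | - | - | 21 | 12 | 7 | 18 | - | - | 23 | 4 | 1 | 4 | - | - |
| Orhan Pamuk – A strangeness in my mind | 23 | 9 | 8 | 14 | - | - | 12 | 8 | 3 | 20 | - | - | 19 | 8 | 4 | 12 | - | - |
| J.R.R. Tolkien – The fellowship of the ring | 24 | 8 | 7 | 14 | - | - | 27 | 10 | 6 | 11 | - | - | 24 | 7 | 2 | - | - | - |
| AVERAGE | 26 | 10.2 | 6.7 | 11.4 | 0.1 | - | 23 | 8.3 | 4.8 | 15.7 | - | - | 21.3 | 6.4 | 3.5 | 2.6 | - | - |
| AVERAGE IN PERCENT | 26.0% | 40.8% | 44.7% | 11.4% | 0.4% | - | 23.0% | 33.2% | 32.0% | 15.7% | - | - | 21.3% | 25.6% | 23.3% | 2.6% | - | - |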

Table 3. Adult books: distribution across the six categories of the 100 most popular tags per book on LibraryThing’s website, and of the twenty-five and fifteen LTFL tag clouds.
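The AVERAGE and AVERAGE IN PERCENT rows of Table 3 can be recomputed from the per-book counts. The sketch below does this for the plot category only, with counts copied from the table; the function name `average_and_share` is hypothetical, and the assumption is that the percentage is the average count divided by the size of the tag collection (100, 25 or 15).

```python
# Plot-category counts for the ten adult books, copied from Table 3.
# Keys are the tag collection sizes: 100 (LibraryThing), 25 and 15 (LTFL).
plot_counts = {
    100: [21, 34, 26, 25, 34, 20, 33, 20, 23, 24],
    25: [5, 15, 8, 9, 16, 10, 13, 9, 9, 8],
    15: [3, 11, 4, 3, 10, 8, 6, 7, 8, 7],
}


def average_and_share(counts, collection_size):
    """Return the mean count and its share of the collection in percent."""
    mean = sum(counts) / len(counts)
    share = 100 * mean / collection_size
    return round(mean, 1), round(share, 1)


summary = {
    size: average_and_share(counts, size)
    for size, counts in plot_counts.items()
}

for size, (mean, share) in summary.items():
    print(f"/{size}: average {mean}, share {share}%")
```

Under that assumption the results agree with the plot columns of the two summary rows, which shows why the share of plot tags rises as the cloud shrinks even though the absolute count falls.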

4.2.2 Distribution of tags for children and young adults’ books

As seen from Table 4 below, the most popular tags on LibraryThing assigned to books for children and young adults belong to category one, plot, and category five, bibliographic, with almost the same average: 28.7% and 27.2% respectively. Category three, characteristics (15.6%), and category four, personal (14.5%), share second place in popularity on LibraryThing. The same patterns are found in the LTFL tags, except that category four, personal, does not appear at all. Categories two, four and six largely contain tags that LibraryThing strives to remove from LTFL, and only one tag from category two, artefact, and none from category four, personal, and category six, unknown, can be found in the library catalogues.

The two sizes of tag clouds in LTFL compare much as they do for the adult books: the distribution of tags across the categories seems to be relatively consistent when scaling down from twenty-five to fifteen tags, but with slightly bigger fluctuations. Just like in the results for the adult books, no tags from categories two, four and six are present in the tag clouds with fifteen tags.

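CATEGORIES: 1 Plot, 2 Artefact, 3 Characteristics, 4 Personal, 5 Bibliographic, 6 Unknown

| Book | 1 Plot /100 | /25 | /15 | 2 Artefact /100 | /25 | /15 | 3 Characteristics /100 | /25 | /15 | 4 Personal /100 | /25 | /15 | 5 Bibliographic /100 | /25 | /15 | 6 Unknown /100 | /25 | /15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Frances Hodgson Burnett – A little princess | 31 | 11 | 5 | 9 | - | - | 18 | 3 | 3 | 13 | - | - | 28 | 11 | 7 | 1 | - | - |
| Orson Scott Card – Ender’s game | 30 | 10 | 6 | 10 | - | - | 22 | 10 | 5 | 17 | - | - | 21 | 5 | 4 | - | - | - |
| Roald Dahl – Charlie and the chocolate factory | 21 | 5 | 3 | 8 | - | - | 19 | 5 | 4 | 13 | - | - | 33 | 15 | 8 | 6 | - | - |
| John Green – The fault in our stars | 37 | 13 | 9 | 11 | - | - | 14 | 5 | 2 | 17 | - | - | 20 | 7 | 4 | 1 | - | - |
| Tove Jansson – Tales from Moominvalley | 15 | 5 | 2 | 11 | - | - | 13 | 4 | 1 | 10 | - | - | 43 | 16 | 12 | 8 | - | - |
| Lene Kaaberbøl – The Shamer’s daughter | 38 | 13 | 8 | 6 | - | - | 12 | 4 | 2 | 19 | - | - | 22 | 8 | 5 | 3 | - | - |
| Astrid Lindgren – Ronia, the robber’s daughter | 27 | 8 | 4 | 9 | 1 | - | 17 | 3 | 2 | 11 | - | - | 26 | 13 | 9 | 10 | - | - |
| Mary Pope Osborne – Dinosaurs before dark | 17 | 9 | 6 | 5 | - | - | 13 | 7 | 4 | 22 | - | - | 26 | 9 | 5 | 17 | - | - |
| J. K. Rowling – Harry Potter and the sorcerer’s stone | 30 | 9 | 7 | 10 | - | - | 18 | 3 | 1 | 9 | - | - | 33 | 13 | 7 | - | - | - |
| Annika Thor – The lily pond | 41 | 16 | 9 | 2 | - | - | 10 | 4 | 2 | 14 | - | - | 20 | 5 | 4 | 13 | - | - |
| AVERAGE | 28.7 | 9.9 | 5.9 | 8.1 | 0.1 | - | 15.6 | 4.8 | 2.6 | 14.5 | - | - | 27.2 | 10.2 | 6.5 | 5.9 | - | - |
| AVERAGE IN PERCENT | 28.7% | 39.6% | 39.3% | 8.1% | 0.4% | - | 15.6% | 19.2% | 17.3% | 14.5% | - | - | 27.2% | 40.8% | 43.3% | 5.9% | - | - |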

Table 4. Children and young adults’ books: distribution across the six categories of the 100 most popular tags per book on LibraryThing’s website, and of the twenty-five and fifteen LTFL tag clouds.

The difference between the distribution of tag categories in adult books and books for children and young adults may be explained by the large number of tags that allude to the target audience found among the tags for the children and young adults’ books, such as “children’s,” “children’s literature,” “children’s books,” “children’s fiction,” “juvenile,” “juvenile fiction,” “YA,” “young adult,” “young adult literature,” and “kids,” a type of tag that is rarely found in adult books.

4.2.3 Analysis of tag distributions

This section analyses the data presented in the above tables, category by category, and compares similarities and differences between the tags on LibraryThing with the tag clouds in LTFL collected from the library catalogues.

All of the tags belonging to category one, plot, focus on the content of the book and are, therefore, of value to end-users other than the tag creator (Golder and Huberman 2006). Furthermore, since the tags in this category, for example “marriage” or “wizards,” contain information that appears for literary fiction in a very limited number of subject headings in the library catalogues, and since literary fiction can be quite subjective and multifaceted, these tags may be considered a valuable addition to the library catalogue. By comparing the 100 most popular tags on LibraryThing with the tags in LTFL, it seems that a few tags from category one, plot, have been excluded from LTFL in the tag moderation process; among these are “journey,” “treehouse,” and “Ender” (main character in “Ender’s game,” by Orson Scott Card). According to LibraryThing’s policies on tag moderation and the tags’ popularity ranking, these should be present in LTFL. That they are not may simply be due to the human factor, since the tag moderation process is performed manually.

Category two, artefact, contains tags referring to a specific edition, copy, or media form, and this category exists because a tag cloud for a book on LibraryThing is completely independent of the copy. This causes users to assign tags to describe the physical properties of the copy. The library catalogue, however, specifies edition and form in the library catalogue record, so the fact that LibraryThing removes these tags from LTFL should be viewed as positive since they would otherwise give misleading and/or redundant information to the library catalogue visitor. Only two tags from this category appear in our LTFL tag collection: the tag “foreign” for The Little Old Lady Who Broke All the Rules, and “German” for Ronia, the Robber’s Daughter. The problem of language and nationality mentioned above (Section 3.3) relates to tags that could refer to the written language of the book, the author’s nationality, or both. For example, if a Swedish library used LTFL, the tag “foreign” for Ronia, the Robber’s Daughter, could give misleading information, because the author and original language of the book is Swedish.

The tags in category three, characteristics, focus on the content of the book but are expressed from the user’s personal purposes, views, perspective and context. These are considered by Golder and Huberman (2006) to be primarily useful to the person who created them, while Rolla (2009) argued that personal tags could be of potential use to other people as well. A number of tags belonging to this category have been excluded from LTFL in the moderation process, such as “favourites,” “children’s classics,” “1001 books,” “badass” and “memorable.” While some of these tags might be less useful in a library catalogue, e.g., “badass,” certain tags may be valuable; for example, a third party may want to know that 307 people thought Harry Potter and the Sorcerer’s Stone was so good that it became one of their favourites, and that Pride and Prejudice is one of the books that got the status “1001 books you have to read before you die.” Genre tags, which are also included in this category, usually pass the tag moderation for LTFL. They can, in theory, be valuable in the library catalogue since they can provide a more extensive description than traditional subject indexing systems, mainly because they are not limited to a few keywords. Even if the tag cloud were to contain duplicates of the subject headings in the library catalogue, they still provide the possibility of a more nuanced and detailed description of the contents of the information resource. At the same time, some tag clouds in LTFL contain one or more synonymous genre tags, such as “fantasy” and “fantasy fiction,” producing unnecessary and redundant information in LTFL.

Category four, personal, is not included at all in LTFL, which is positive since tags in this category are entirely created for the user’s personal use. Tags such as “own,” “read in 2015” and “at home” are completely irrelevant to other people and have no function in a library catalogue (Golder and Huberman 2006). A very small ratio of personal tags is ranked among the top thirty most popular tags on LibraryThing in our sample; only the books with small tag collections have a lot of personal tags among the top-ranking tags. Given that libraries often have large, varied collections, which include books that have both large and small tag clouds on LibraryThing, the tag moderation process for this category improves the quality of the resulting tag clouds.

The tags in category five, bibliographic, mainly contain information that is already present in the catalogue. However, some of the tags could complement the library catalogue, for example, information about the author’s nationality. Furthermore, we have discovered that this category contains some tags that, according to the LibraryThing tag moderation process, should have been included in LTFL but have instead been removed. Publishing years, such as “2009,” have not been included at all in LTFL but occur frequently on LibraryThing’s website; one may speculate that the reason for this is that for a third party it is impossible to determine what purpose the user had with a certain tag. Other examples of discrepancies have been found in LTFL regarding several popular concepts, which have been included in the tag clouds for some books and excluded for others. For example, some authors’ surnames such as “Tolkien” and “Rowling” have been excluded from LTFL while “Austen” and “Atwood” are included. Names of book series are another inconsistency; e.g., “Harry Potter series” has been included in LTFL while “Lord of the Rings” (the name of the fantasy series by J.R.R. Tolkien) was removed. Nationality is yet another variable: for example, “Finnish” has been discarded from Tales from Moominvalley in the library catalogue, while “Turkish” is included in A Strangeness in My Mind.

Category six, unknown, includes all tags in languages other than Swedish or English, unknown abbreviations, codes and the like. The number of tags in category six, unknown, is relatively small, 2.6% for the adult books and 5.9% for the children and young adult books, and they are rarely found among the most popular tags.
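The “blueprint” comparison used throughout this analysis, where the most popular tags on LibraryThing are checked against the LTFL tag cloud for the same book, amounts to a simple list comparison. The following is a minimal sketch, not the procedure actually used in the study: the function name `missing_from_cloud` is hypothetical, the ranking and the cloud are invented sample data, and matching is assumed to be case-insensitive.

```python
def missing_from_cloud(librarything_ranked, ltfl_cloud):
    """Return top-ranked LibraryThing tags that are absent from the LTFL cloud.

    Only as many top-ranked tags as the cloud holds are considered, since
    those are the tags the cloud would contain without any moderation.
    """
    cloud = {tag.lower() for tag in ltfl_cloud}
    blueprint = librarything_ranked[: len(ltfl_cloud)]
    return [tag for tag in blueprint if tag.lower() not in cloud]


# Invented ranking (most popular first) and an invented five-tag cloud.
librarything_ranked = [
    "science fiction", "fiction", "own", "Ender", "aliens", "war", "to read",
]
ltfl_cloud = ["science fiction", "aliens", "war", "military", "children"]

excluded = missing_from_cloud(librarything_ranked, ltfl_cloud)
print(excluded)
```

Each tag returned this way still has to be categorised by hand: a missing personal tag such as “own” is consistent with the stated moderation policy, whereas a missing plot or bibliographic tag points to an inconsistency of the kind discussed above.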

4.2.4 Tag clouds in LTFL with fifteen and twenty-five tags

The fifteen and twenty-five tag clouds can contain many similarities and differences regarding the content and description of the book. The following three examples illustrate how limitations of the tag clouds, and the total size of the tag collection, can affect the description of the book:

– A book with a very large tag collection on LibraryThing: Pride and Prejudice by Jane Austen;
– A book with a moderate tag collection on LibraryThing: The Fault in Our Stars by John Green; and,
– A book with a small tag collection on LibraryThing: The Little Old Lady Who Broke All the Rules by Catharina Ingelman-Sundberg.

4.2.4.1 Pride and Prejudice by Jane Austen

The English writer Jane Austen’s most famous novel Pride and Prejudice was first published in 1813 and has over 2,000 different tags on LibraryThing. As seen from Figure 2, the fifteen-tag cloud shows that this is English literature written by Jane Austen in the 19th century and the regency era. It is seen as a classic and the only tags that give any information about the book topics are “England,” “love” and “romance.”

As seen from Figure 3 and the twenty-five-tag cloud, when another ten tags are added, the picture of the content is enriched. The book’s plot is described as a Victorian historical romance with humour that takes place in England, where women, marriage and family, especially sisters, play a major role, and one of the characters is Mr. Darcy. Thus, the twenty-five-tag cloud gives a clearer description of the contents of the book.

When it comes to tag category distribution, both tag clouds comprise four tags that refer to the British/English origins of the book (category five, bibliographic), three tags describe it as a classic (category three, characteristics), and two tags denote the author’s name (category five, bibliographic). The twenty-five-tag cloud additionally contains two tags regarding both the 19th century (category five, bibliographic) and the historical literature (category three, characteristics).

Figure 2. Tag cloud with fifteen tags for Pride and Prejudice.

Figure 3. Tag cloud with twenty-five tags for Pride and Prejudice.

4.2.4.2 The Fault in Our Stars by John Green

American author John Green published the book The Fault in Our Stars in 2012. The book is aimed primarily at young adults and has just under 1000 different tags on LibraryThing. As seen from Figure 4, the fifteen-tag cloud informs the visitor that the book is a realistic fiction story aimed at young adults. It is set in the present time and addresses topics of cancer, death, friendship, sadness, illness, love, relationships and possibly involves Amsterdam. When another ten tags are added to the tag cloud (Figure 5), the content and character of the plot become more specified and imply a love story where one person is sick and dying of cancer. Death seems to be a major theme alongside those of family, love, humour and aging; two contemporary places are important to the story, namely Amsterdam and Indiana. However, at the same time, the larger tag cloud contains more problematic tags, including synonyms and bibliographic information already in the library catalogue, such as four different tags denoting the target audience (category five, bibliographic), and two tags describing the book as contemporary (category three, characteristics).

Figure 4. Tag cloud with fifteen tags for The Fault in Our Stars.

Figure 5. Tag cloud with twenty-five tags for The Fault in Our Stars.

4.2.4.3 The Little Old Lady Who Broke All the Rules by Catharina Ingelman-Sundberg

This easy-going crime comedy is written by a Swedish author, Catharina Ingelman-Sundberg, and was published in Swedish for the first time in 2012. It was translated into English in 2014 and has almost 200 different tags on LibraryThing.

The fifteen-tag cloud shown in Figure 6 implies a humorous work of crime fiction written by a Swedish author. The plot is denoted as taking place in contemporary Sweden and seems to include adventure, mystery, older people and some kind of a robbery. The tag cloud with twenty-five tags (Figure 7) clarifies the plot somewhat, but not to the same extent as in the previous two examples. Instead, the tag cloud is filled with synonymous and closely related terms, for example “elderly,” “old age,” “old people,” “retirement,” “senior citizens,” and “seniors” (category one, plot), “humor” and “humorous” (category three, characteristics), “Swedish author,” “Swedish literature” and “Swedish” (category five, bibliographic).

Figure 6. Tag cloud with fifteen tags for The Little Old Lady Who Broke All the Rules.

Figure 7. Tag cloud with twenty-five tags for The Little Old Lady Who Broke All the Rules.

In summary, the three examples seem to illustrate that the larger the tag cloud, the more developed and defined the content of the book seems to become. However, at the same time, larger tag clouds tend to lead to more redundant tags that are essentially synonymous.

4.2.4.4 Tag clouds in LTFL with fifteen and twenty-five tags

The ultimate mission of LibraryThing is to provide users with a personal catalogue, a reading list or an overview of their home library (LibraryThing 2017). Their users each have their own individual purposes, goals, perspectives, vocabularies, methods and contexts. They have often read the book they assign tags to, and the LibraryThing system allows a high degree of personal freedom. This leads in the end to each person’s LibraryThing catalogue being a unique system. Furthermore, LibraryThing is a social platform where seeing the complete tag cloud of the book, together with the ability to merge tags with the same meaning, allows a single tag to be used by thousands of users. On the other hand, in the library catalogue each book is uniquely described with systematic and detailed bibliographic information, with the system being specifically structured to offer library services and access to its collections. Therefore, moving the tags from the LibraryThing context to that of a library catalogue is not a straightforward endeavour.

The identified problems of multiple synonymous tags and tags denoting already existing bibliographic information are hard to avoid. However, based on the above analysis, it seems that tags are adding valuable information to the library catalogue. Category one, plot, focuses on the content of the book; since this type of information typically appears in a rather limited number of general subject headings, and since literary fiction can be quite subjective and multifaceted, tags such as “marriage” and “wizards” are a valuable addition. Tags in category three, characteristics, describe the contents of the book from the user’s own perspective, and while they are personal in nature, they can still be valuable to other people, for example, “children’s classics” and “favourites.” Category five, bibliographic, contains a lot of information that is already in the library catalogue, but also some valuable complements, such as “Swedish author” and “British.” The remaining three categories (two, artefact; four, personal; and six, unknown) are not suited to the library context, as described above in Section 4.2.3; however, these have been largely removed from LTFL in the tag moderation process.

When it comes to size limitations, as seen above, the larger size of the tag clouds in LTFL seems to contribute to a clearer and more comprehensive description of the book, but at the same time the number of redundant tags also increases. One consequence of LTFL tag cloud size limitation in each individual record in the library catalogue is that minorities and divergent perspectives are ruled out, particularly for books with larger tag clouds, since only the most popular tags are included. One possible solution to this limitation is to create the same function in LTFL that already exists on LibraryThing’s website, where users have the option to view the complete tag cloud by choosing an option to expand the view of the tags, something also mentioned by Pirmann (2012). This would provide visitors of the library catalogue an opportunity to investigate the book’s properties and content more closely, regardless of the size limitation their library has chosen for the tag clouds.

5.0 Conclusion

Since a relatively large number of tags is needed in order to create a stable pattern and to reach a general consensus regarding the description of an information resource, it might be advantageous for a library catalogue to import tags from an external source, such as LibraryThing, rather than to provide its own service. We aimed to determine how the tag moderation process of LibraryThing and tag cloud sizes of its LTFL service affect the representation of an information resource, and which differences there are between the tag clouds on LibraryThing and those imported to libraries.

According to LibraryThing, only inherently personal tags are excluded in the tag moderation process, for example “left it at mom’s house.” Results show that while LibraryThing claims to remove only the inherently personal tags, this seems to be only partly true since some other tags are absent in the library catalogue as well. In addition, inconsistencies have been identified where a certain type of tag, for example author, has been removed from some books in LTFL but included in others. Furthermore, some tags could be valuable to other people despite their personal nature, such as tags describing opinions or attributes of an information resource, for example, “favourite” and “1001 books;” however, these have often been removed by LibraryThing.

The tag clouds in LTFL mainly consist of tags belonging to category one, plot, category three, characteristics, and category five, bibliographic. According to LibraryThing, only inherently personal tags are removed in the tag moderation process; however, the sample in this study reveals that some other tags have been excluded from LTFL as well. Examples can be found in category one, plot, where “journey,” “treehouse” and “Ender” were absent in the tag clouds in LTFL, and in category five, bibliographic, where “fiction,” “novel” and “series” did not get approved for LTFL. Interestingly, some types of tags have been approved in LTFL for some books, while they are absent in other books’ tag clouds in LTFL. For example, some authors’ surnames such as “Tolkien” and “Rowling” have been excluded from LTFL while “Austen” and “Atwood” are included.

The size of the tag clouds seems to affect the representation of the contents of an information resource, where a larger tag cloud seems to give a more extensive description of the resource, yet this also increases the number of tags with synonymous or redundant information due to the differences in context between the tags’ original purpose on LibraryThing and the library catalogue. LibraryThing has tried to lessen the problem of synonymous tags by allowing their users to connect tags with the same meaning under one main keyword; however, the sample data in this study still contain a lot of synonyms and closely related terms.

While this study has been limited to twenty books of literary fiction with tags collected from two libraries, planned future research would include larger samples and comparison across different genres and topics. Another topic of interest would be to study the merging of synonymous tags, and how this affects the library catalogue. Redundant tags are impossible to avoid when tags are moved from their original context (on LibraryThing); however, it would be interesting to study whether and how these can be minimized in future. Furthermore, a comparison against existing subject headings systems used in libraries would better illustrate the benefits of end-user tagging. Finally, all of this should be put in a larger context of how imported tags affect information retrieval in library catalogues, to determine their actual value in a real-life context.

References

Adler, Melissa. 2009. “Transcending Library Catalogs: A Comparative Study of Controlled Terms in Library of Congress Subject Headings and User-Generated Tags in LibraryThing for Transgender Books.” Journal of Web Librarianship 4: 309-31. doi: 10.1080/19322900903341099
Anfinnsen, Svein, Gheorghita Ghinea and Sergio de Cesare. 2011. “Web 2.0 and Folksonomies in a Library Context.” International Journal of Information Management 31: 63-70. doi: 10.1016/j.ijinfomgt.2010.05.006
Bates, Jo and Jennifer Rowley. 2011. “Social Reproduction and Exclusion in Subject Indexing: A Comparison of Public Library OPACs and LibraryThing Folksonomy.” Journal of Documentation 67: 431-48. doi: 10.1108/00220411111124532
Fox, Melodie J. 2012. “Communities of Practice, Gender and Social Tagging.” In Categories, Relations and Contexts in Knowledge Organization: Proceedings of the Twelfth International ISKO Conference 6-9 August 2012 Mysore, India ed. A. Neelameghan and K. S. Raghavan. Wurzburg: Ergon, 352-8.
Fox, Melodie J. and Austin Reece. 2013. “The Impossible Decision: Social Tagging and Derrida’s Deconstructed Hospitality.” Knowledge Organization 40: 260-5. doi: 10.5771/0943-7444-2013-4-260
Furner, Jonathan. 2010. “Folksonomies.” In Encyclopedia of Library and Information Sciences, ed. Marcia J. Bates and Mary Niles Maack. Boca Raton, FL: CRC Press, 1858-66.
Gerolimos, Michalis. 2013. “Tagging for Libraries: A Review of the Effectiveness of Tagging Systems for Library Catalogs.” Journal of Library Metadata 13: 36-58. doi: 10.1080/19386389.2013.778730
Golder, Scott A. and Bernardo A. Huberman. 2006. “Usage Patterns of Collaborative Tagging Systems.” Journal of Information Science 32: 198-208. doi: 10.1177/0165551506062337
Golub, Koraljka, Marianne Lykke and Douglas Tudhope. 2014. “Enhancing Social Tagging with Automated Keywords from the Dewey Decimal Classification.” Journal of Documentation 70: 801-28. doi: 10.1108/JD-05-2013-0056
Golub, Koraljka. 2016. “Potential and Challenges of Subject Access in Libraries Today on the Example of Swedish Libraries.” International Information & Library Review 48: 204-10. doi: 10.1080/10572317.2016.1205406
Guy, Marieke and Emma Tonkin. 2006. “Folksonomies: Tidying up Tags?” D-Lib Magazine 12, no. 1. doi: 10.1045/january2006-guy
Kakali, Constantia. 2014. “A Utilization Model of Users’ Metadata in Libraries.” Journal of Academic Librarianship 40: 565-73. doi: 10.1016/j.acalib.2014.08.004
Kipp, Margaret E. I. 2011. “Tagging of Biomedical Articles on CiteULike: A Comparison of User, Author and Professional Indexing.” Knowledge Organization 38: 245-61. doi: 10.5771/0943-7444-2011-3-245
Kipp, Margaret E. I., Jihee Beak and Ann M. Graf. 2015. “Tagging of Banned and Challenged Books.” Knowledge Organization 42: 276-83. doi: 10.5771/0943-7444-2015-5-276
Kipp, Margaret E. I. and D. Grant Campbell. 2010. “Searching with Tags: Do Tags Help Users Find Things?” Knowledge Organization 37: 239-55.
LibraryThing. 2017. “About LibraryThing.” Accessed 21 September. https://www.librarything.com/about
LibraryThing. 2018. “Zeitgeist.” Accessed 3 October. https://se.librarything.com/zeitgeist
LibraryThing for Libraries. 2017. “About LibraryThing for Libraries.” Accessed 21 September. http://www.librarything.com/forlibraries/about
Lin, Xia, Joan E. Beaudoin, Yen Bui and Kaushal Desai. 2006. “Exploring Characteristics of Social Classification.” 17th Annual ASIS&T SIG/CR Classification Research Workshop, 1-19. doi: 10.7152/acro.v17i1.12491
Lu, Chao, Chengzhi Zhang and Daqing He. 2016. “Comparative Analysis of Book Tags: A Cross-Lingual Perspective.” The Electronic Library 34: 666-82. doi: 10.1108/EL-03-2015-0042
Munk, Timme Bisgaard and Kristian Mørk. 2007a. “Folksonomies, Tagging Communities and Tagging Strategies: An Empirical Study.” Knowledge Organization 34: 115-27.
Munk, Timme Bisgaard and Kristian Mørk. 2007b. “Folksonomy, the Power Law & the Significance of the Least Effort.” Knowledge Organization 34: 16-33.
Pirmann, Carrie M. 2012. “Tags in the Catalogue: Insights from a Usability Study of LibraryThing for Libraries.” Library Trends 61, no. 1: 234-47. doi: 10.1353/lib.2012.0021
Rafferty, Pauline. 2011. “Informative Tagging of Images: The Importance of Modality in Interpretation.” Knowledge Organization 38: 283-98.
Rafferty, Pauline. 2018. “Tagging.” Knowledge Organization 45: 500-16.
Ram, Shri. 2015. “Tag Cloud Application and Information Retrieval System: Visualization to Create Information Literacy.” DESIDOC Journal of Library & Information Technology 35: 41-6. doi: 10.14429/djlit.35.1.8036
Rolla, Peter J. 2009. “User Tags Versus Subject Headings: Can User-Supplied Data Improve Subject Access to Li-

brary Collections?” Library Resources & Technical Services Westcott, Jezmynne, Alexandra Chappell and Candace 53: 174-84. Lebel. 2009. “LibraryThing for Libraries at Claremont.” Spiteri. Louise F. 2006. “The Use of Folksonomies in Pub- Library Hi Tech 27: 78-81. doi: 10.1108/0737883091094 lic Library Catalogues.” The Serials Librarian 51: 75-89. 2937 doi: 10.1300/J123v51n02_06 Voorbij, Henk. 2012. “The Value of LibraryThing Tags for Spiteri, Louise F. and Jen Pecoskie. 2016. “In the Readers’ Academic Libraries.” Online Information Review 36: 196- Own Words: How User Content in the Catalog Can En- 217. doi: 10.1108/14684521211229039 hance Readers’ Advisory Services.” Reference & User Ser- Wu, Dan, Xiaomei Xu and Wenting Yu. 2016. “Comparing vices Quarterly 56: 91-5. Collaborative Annotations on Books between Libraries Steele, Tom. 2009. “The New Cooperative Cataloging.” Li- and Social Community Sites: A Case Study.” The Elec- brary Hi Tech 27: 68-77. doi: 10.1108/07378830910942 tronic Library 34: 178-95. doi: 10.1108/EL-09-2014-01 928

260 Knowl. Org. 46(2019)No.4 E. Quinlan and P. Rafferty. Astronomy Classification: Towards a Faceted Classification Scheme

Astronomy Classification: Towards a Faceted Classification Scheme
Emma Quinlan* and Pauline Rafferty**
*Nuffield College Library, New Road, Oxford, OX1 1NF
**Aberystwyth University, Department of Information Studies, Aberystwyth SY23 3UX

Emma Quinlan is an assistant librarian at Nuffield College, University of Oxford, with archive duties. Her main responsibilities include maintaining and facilitating access to the archive collections, managing inter-library loan requests and cataloguing and classification of book stock. As a newly qualified information professional, her research interests include bibliographic classification and MARC21 cataloguing with particular interest in faceted classification and scientific classification schemes. She holds a BSc in observational astronomy from the University of Glamorgan and an MA in information and library studies from Aberystwyth University. This is her first piece of academic output.

Pauline Rafferty is Senior Lecturer at Aberystwyth University, teaching knowledge organization and representation, information architectures, and qualitative approaches to research. Her research interests include knowledge organisation and cultural documentation, and critical communication and information studies. Specific areas of interest include popular culture in and through the web, the democratisation of critical authority, and participative digital cultural production. She holds a PhD in critical theory and cultural studies. Pauline is Joint Editor of Journal of Information Science and Joint Regional Editor of The Electronic Library. She has co-edited books for Facet and co-written a book for Ashgate.

Quinlan, Emma and Pauline Rafferty. 2019. “Astronomy Classification: Towards a Faceted Classification Scheme.” Knowledge Organization 46(4): 260-278. 59 references. DOI:10.5771/0943-7444-2019-4-260.

Abstract: Astronomy classification is often overlooked in classification discourse. Its rarity and obscurity, especially within UK librarianship, suggest it is an underdeveloped strand of classification research and is possibly undervalued in modern librarianship. The purpose of this research is to investigate the suitability and practicalities of the discipline of astronomy adopting a subject-specific faceted classification scheme and to provide a provisional outline of a special faceted astronomy classification scheme. The research demonstrates that the application of universal schemes for astronomy classification has left the interdisciplinary subject ill catered for and outdated, making accurate classification difficult for specialist astronomy collections. A faceted approach to classification development is supported by two qualitative literature-based research methods: historical research into astronomy classification and an analytico-synthetic classification case study. The subsequent classification development is influenced through a pragmatic and scholarly-scientific approach and constructed by means of instruction from faceted classification guides by Vickery (1960) and Batley (2005), and faceted classification principles from Ranganathan (1937). This research fills a gap within classification discourse on specialist interdisciplinary subjects, specifically within astronomy, and demonstrates the best means for their classification. It provides a means of assessing further the value of faceted classification within astronomy librarianship.

Received: 5 March 2019; Revised: 7 May 2019; Accepted: 20 May 2019

Keywords: classification, astronomy, subject, notation, classification scheme, faceted

1.0 Introduction

1.1 Background

Corbin states that (2003, 145): “astronomers have a dependence on the librarian to make needed information accessible.” Librarians are the key to providing the means of access to these versatile collections. As well as providing catalogues in which to search for astronomical information, classification of these materials provides the secondary access point which enables information retrieval and location. It is with this access in mind that this research provides its practicality by creating a means of classifying astronomy collections through a new subject-specific classification scheme.

The scientific discourse of researchers in the field of astronomy is of some interest to knowledge organization researchers; for example, Ibekwe-SanJuan’s (2008) research revealed the role that geographic location can play in the development of the field and also in the distribution of terminology. In relation to bibliographic classification, Corbin (2003, 142) found that most special astronomy libraries classify their materials either using an in-house scheme, developed specifically for their collections, or sub-sections of universal classification schemes. Globally, the three most common classification schemes used in astronomy libraries are Library of Congress Classification (LCC), Dewey Decimal Classification (DDC), and Universal Decimal Classification (UDC) (Corbin 2003, 142). While there are no comprehensive astronomy classification schemes to date, there have been previous attempts to create a scheme. The Physics and Astronomy Classification Scheme (PACS), developed by the American Institute of Physics (AIP), was used to classify online journals and indexes, databases, and astronomy and physics catalogues (Hider and Harvey 2008, 126), before it was replaced by the AIP Thesaurus (Access Innovations Inc. 2018). The move from the PACS to the AIP Thesaurus has left an opening within astronomy classification for a new comprehensive astronomy scheme.

A review into discourse on classification theory emphasises two main classification types: enumerative, which have a one-dimensional top-down organisational approach with predefined notation (Losee 1995), and faceted, which are broken down into their constituent parts starting with the subject fields (Broughton 2015) and further divided by “facets” with unset notation (Chowdhury and Chowdhury 2007). There is ambiguity surrounding the term “analytico-synthetic classification;” the term originating as a synonym for “faceted classification.” La Barre (2006), discussing the historical development of faceted analytico-synthetic theory (FAST), notes that while work on FAST began in the 1930s, it was not codified until Ranganathan published Prolegomena to Library Science in 1957. Her findings, which showed how FAST underpins the design of informational and promotional websites, demonstrate the ongoing usefulness and significance of FAST in the digital information age. Recent literature by Chowdhury and Chowdhury (2007), Chowdhury (2010), and Broughton (2015) identifies analytico-synthetic as a new classification type, containing both enumerative and faceted features allowing the content of an item to be split into its component parts (analysed; enumerative) and then a class mark to be built from the notation of each part (synthesised; faceted). Therefore, this study defines analytico-synthetic as separate from enumerative and faceted schemes.

A review into why special subject-specific classifications are built produced the following quote by Vickery (1960, 7), summing up the enduring reasoning behind the process:

Several reasons may be given why existing general schemes are unsatisfactory. First, most of them do not give adequate detail for accurate specification of the highly complex subjects in papers and reports that documentation must handle today. Second, despite the comprehensiveness and variety of certain general schemes, they do not fully cater for the special viewpoints of each particular library or information centre. Third, even if they are varied in viewpoint, they do not sufficiently provide for the flexible combination of terms which highly specific subject headings demand. Fourth, even if flexible, they achieve such flexibility only by unnecessarily lengthy or complicated notational means. Fifth, they fail to give optimum helpfulness in filing order.

This perspective is supported by Vickery (1960) and Herner and Meyer (1957) when discussing how quickly evolving and complex subject areas have outgrown universal classification schemes.

Herner and Meyer (1957, 801) found seven requirements for the creation of a specialist classification scheme: 1) terms used must reflect the current use of language within the subject area (hospitable); 2) the scheme must be suitable for the type and purpose of literature (flexible); 3) terms and classes used must be distinguishable in meaning and content (unique); 4) classification structure should allow for equal distribution of documents over an easy to see structure (simple); 5) the scheme must cater for the addition of new subject matter (hospitable); 6) notation must be consistent and easily recognised and deduced (brevity, mnemonic); and, 7) the scheme must group comparable subjects together and use hierarchy for user needs (expressive). The requirements satisfy classification discourse (Berwick Sayers 1955; Chowdhury and Chowdhury 2007; Broughton 2015; Hunter 2018) in that notations need to assume the following qualities: expressiveness, mnemonics, simplicity, uniqueness, brevity, flexibility, and hospitality.

1.2 Purpose of research

The purpose of this research is to investigate the suitability and practicalities of the discipline of astronomy adopting a subject-specific faceted classification scheme and to offer the beginnings of such a specialist scheme. Faceted science classifications have become popular due to their flexibility and ease of development. Traditional schemes provide a means of classifying universal knowledge, making them time consuming to update and hard to specify. The creation of a specific classification scheme using faceted principles requires minimal time to build (in comparison to a general revision) and will be adaptable to new astronomy topics. This research contributes to the literature in developing the foundations of an astronomy classification using faceted classification principles. It is hoped that this initial study might lead to further development.

2.0 Methodology

2.1 Research methods

A qualitative methodology is employed in the form of historical research and a case study, influencing and forming the foundation of the faceted classification development. The historical research into the development and use of astronomy notation within universal classification schemes is undertaken using thematic analysis of conference proceedings and practical evaluation of primary classification schedules. The conference proceedings were from the Astronomical Society of the Pacific Conference Series, the Library and Information Services in Astronomy (LISA) conferences, ranging from LISA I (1988) to VIII (2017), and the Astronomical Data Analysis Software and Systems (ADASS) conferences, ranging from ADASS XV (2005) to XXIII (2013). All articles were chosen based on their topic applicability within astronomy classification with a total of ten analysed thematically. Classification schemes offering astronomy notation are analysed within the context of their suitability in providing detailed subject-specific notation and this highlights the issues of astronomy notation creation. The literature-based case study of INSPEC, a faceted/analytico-synthetic physics classification, involving the review and analysis of INSPEC schedules and secondary literature, is undertaken to examine whether interdisciplinary subjects are suited to faceted library classifications. The scheme’s notational qualities are analysed using an analytical framework based on Berwick Sayers’ (1955, 60) ideal qualities of classification schemes. The case study highlights the suitability of faceted classification principles being adopted in interdisciplinary subjects and provides a grounded classification example for the faceted astronomy classification. The application of the research findings contributes to the development of the faceted astronomy classification.

2.2 Analysis of findings and classification development

The findings of the historical research are analysed thematically from the grouping of codes, comprising the types of astronomy classification: LCC, DDC, UDC, Dewhirst, and astronomy classification tools, including IAU Thesaurus, AIP Thesaurus, and the Unified Astronomy Thesaurus (UAT). The findings of the case study are analysed thematically from Berwick Sayers’ ideals of classification. Both research methods are analysed separately, providing layers of reasoning, supporting the purpose of this study in helping to form the basis of classification development. The final outcome of the study is the design of a provisional faceted astronomy classification. This development is broken down into its constituent parts: faceted classification principles, classification design including discipline, vocabulary, notation, examples, and provisional classification. The development highlights the applicability of faceted principles to the discipline of astronomy and provides a means of further development after the conclusion of this study.

2.3 Ethical considerations

This research is governed by the ethical code of conduct laid out by the British Sociological Association (2002) statement of ethical practice and Aberystwyth University’s ethical guidelines. Any information of a sensitive nature is handled with the level of data protection required for the information, as influenced by the BSA and the CILIP (2015) code of professional practice.

3.0 Historical Research

3.1 Background

Astronomy, an educational and research subject, has evolved slowly since its transformation in the middle ages, from a tool for creating calendars to understanding our place in the solar system with Copernicus’ heliocentric model (Hoskin 2003). Diversification of astronomy came in the late twentieth century, where new technology and observational techniques expanded the subject at an ever-increasing rate. Crovisier and Intner (1987) observed that instead of being the tightly compacted and well-organised science of the nineteenth century, astronomy grew into a loosely defined and broad subject area. It is this change that has enabled astronomy to outgrow the larger universal classification schemes and to warrant its own subject-specific scheme.

3.2 Astronomy coverage in universal classification

The LCC table for astronomy is represented by the beginning notation QB within class Q (science). Its structure is based on the original 1905 scheme with little change implemented through the six revisions up until the 1980s (Crovisier and Intner 1987). The schedule’s development was based on the publication of new monographs, which was reliant on research published in journal papers. It is this delay in revision which hinders the schedule’s ability to be truly up-to-date and to fully integrate new subject developments. The enumerative qualities of this scheme were highly unsuited for the interdisciplinary subject in the late 1980s, with the various areas of astronomy dispersed amongst the other class Q tables such as physics and geology (Crovisier and Intner 1987). This concept scattering occurs in most enumerative schemes and is seen in the example topic of “planetary geology.” In older versions of the LCC, works discussing the geology of the inner terrestrial planets would sometimes be placed in QE (geology) instead of QB (astronomy) due to their emphasis on physical terrestrial landforms (Crovisier and Intner 1987). Today, the QB table has been revised to include certain interdisciplinary subjects such as QB455-456 astrogeology. This is not a catch-all solution where interdisciplinary topics are classed in QB; its implementation is reliant on the librarian’s subject understanding and nature of collections.

The DDC schedule for astronomy is represented by the beginning notation 520 within the 500 class (science). The structure of the 520 table has changed very little in the last twenty years as can be seen by comparing the twenty-first edition to the WebDewey version (http://www.dewey.org/webdewey/standardSearch.html). The main change is the addition of objects to existing subfields, e.g., 523.4 Planets has become 523.4 Planets, asteroids, trans-Neptunian objects of the solar system. The issues that arose within the LCC schedules are apparent in DDC and have mainly to do with the set structure of the enumerative classification type. Rowley and Hartley (2008, 209) found that the application of faceted principles to DDC has made the scheme much more flexible in notation building. This has allowed for better interdisciplinary classification but has limited use within the astronomy schedule due to its subject specificity.

The creation of UDC from DDC schedules, and subsequent faceted revisions, has provided a fully universal analytico-synthetic scheme with the UDC table for astronomy represented by the beginning notation of 52. The scheme underwent its first revision of the astronomy schedule in 1975 and a secondary revision in the 1990s (Wilkins 1989; 1995). It was found that sub-classes within the 52 class were outdated with some dropped for newer subjects. For example, in the 1975 revision, the 522/525/526 classes were cancelled and reissued for new use, with major changes taking place for classes 520/521/523 and 524 (Wilkins 1989). Even though changes were made, the scheme quickly became outdated again and the second revision focused on updating the whole discipline. Again, the focus was on classes 520 to 524, and this time Wilkins (1995) reported there was a call for astronomers, as experts, to help with the revision. The lack of expertise in schedule revision meant less was done to make the scheme truly suitable for astronomy classification, even though it was used within astronomy libraries (Wilkins 1989).

3.3 Specialist astronomy classification tools

The most well-known special astronomy classification, the Dewhirst classification (DC), was developed in the Institute of Astronomy at Cambridge University (Heck 2003). The scheme is based on the classification used within Astronomy & Astrophysics Abstracts; however, it was adapted to the library collections. The DC varies from the universal schemes as the schedules for astronomy were revised by a professional astronomer and librarian, thereby achieving the revision process attempted in UDC. The scheme itself has been revised four times, the most recent attempt in 2014. Its enumerative structure means that it is hard to apply new interdisciplinary topics to the already full denominations. A lack of literature makes it hard to assess its everyday practicality; however, its main fault is a lack of brevity.

The IAU Thesaurus was developed in 1984 as a means of creating standardised astronomical terminology for cataloguing and was last revised between 1993 and 1995 (Lesteven et al. 2007). A further revision was undertaken in 2000 and resulted in the thesaurus’ evolution into the International Virtual Observatory Alliance Thesaurus (Frey et al. 2015). This version of the thesaurus is still currently used and, as of 2017, has 2,890 concepts (BARTOC.org 2017). The abandonment of the IAU thesaurus led to the development of the UAT, which is a current astronomy thesaurus that has real practical implications for astronomy classification.

The UAT was developed to provide a free and community-supported astronomy and astrophysics vocabulary which could be used by the astronomy community in the classification of journal articles and books (Accomazzi et al. 2014). The development of this thesaurus resulted from various outdated thesauri and vocabulary, such as the IAU and PACS, that were present in astronomical and astrophysical journals (Frey et al. 2015). The collaboration of physicists and astronomers to produce a unified thesaurus guaranteed the investment and development required for the thesaurus to be updated and used. The thesaurus can be searched alphabetically or hierarchically, with each entry showing narrower, broader, and related terms.

The AIP Thesaurus is the remainder of the PACS. This classification tool was originally developed as a physics and astronomy classification scheme before it became unwieldy and was reduced to a more manageable thesaurus (Access Innovations Inc. 2018). It is currently used to help classify journal articles, but it is no longer maintained. This classification tool demonstrates the life cycle of independent astronomy classification schemes and their inevitable disappearance due to a lack of expert-guided revisions and funding.

3.4 Astronomy notation

The following classifications from the LCC, DDC, UDC, DC, and UAT schemes showcase their current ability to provide

LCC: QB 843.S95 (The Library of Congress 2017)
QB         Astronomy
495-903    Descriptive Astronomy
799-903    Stars
843.       Other particular types of stars, A-Z
S95        Supernovae

Example 1. LCC.

DDC: 523.844 65 (WebDewey 2017)
520           Astronomy
523           Specific Celestial Bodies & Phenomena
523.8         Stars
523.84        Aggregations and variable stars
523.844       Variable stars
523.844 6     Eruptive variables
523.844 65    Supernovas

Example 2. DDC.
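
Example 2 lists each broader class that the DDC number 523.844 65 passes through. Because DDC notation is decimal, that chain can be recovered by truncating the number one digit at a time. The sketch below is only an illustration of this property: the function names `format_ddc` and `ddc_chain` are hypothetical, and the code assumes the display convention seen in Example 2 (a point after the third digit and a space after every further three digits). It starts at the three-digit level (523), so it does not produce the broader entry 520 shown in the example.

```python
def format_ddc(digits: str) -> str:
    """Format a bare digit string in the display style used in Example 2."""
    head, tail = digits[:3], digits[3:]
    if not tail:
        return head
    # Group the decimal part in threes, separated by spaces.
    groups = [tail[i:i + 3] for i in range(0, len(tail), 3)]
    return head + "." + " ".join(groups)


def ddc_chain(number: str) -> list[str]:
    """Return the number and its broader classes, broadest first."""
    digits = number.replace(".", "").replace(" ", "")
    if len(digits) < 3 or not digits.isdigit():
        raise ValueError(f"not a DDC number: {number!r}")
    # Each shorter prefix of the digit string is a broader class.
    return [format_ddc(digits[:n]) for n in range(3, len(digits) + 1)]


print(ddc_chain("523.844 65"))
```

Each element of the returned list corresponds to one row of Example 2 from 523 downwards, which is what makes the notation expressive of its hierarchy.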

UDC: 524.352 (British Standards Institution 2005)
52         Astronomy. Astrophysics. Space Research. Geodesy
524        Stars. Stellar Systems. The Universe
524.3      Stars
524.35     Supernovae and related objects. Peculiar stars
524.352    Supernovae

Example 3. UDC.

DC: 122 (Institute of Astronomy 2017)
Group XI.    Stars and the Galaxy
122.         Supernovae. Supernovae Remnants

Example 4. DC.

notation for the complex subject area of supernovae/supernovae remnants. Green and Jones (2015, 355) describe a supernova as “an outburst in which a star suddenly increases in brightness by an enormous factor (~10⁶). Such a star is ending its life in a gigantic explosion from the collapse of its core.” There are two main supernovae types, “type I” and “type II,” with “type I” divided into three subtypes; Ia, Ib and Ic (Green and Jones 2015, 229). Any classification scheme must be able to provide for complete specification of supernovae types (i.e., “type Ib”) or for material on the general subject matter of which supernovae is just one aspect (i.e., star evolution).

The LCC notation places the object type under descriptive astronomy, stars, and other types of stars whilst DDC puts it under specific celestial bodies and phenomena, stars, and variable objects. The UDC notation puts the object type under stars and then its own heading of supernovae and related objects whilst DC puts it under stars and the galaxy. DC is the only scheme to mention both supernovae and supernovae remnants, even if they are detailed under the same notation. These schemes have trouble dividing the different elements of the object within a document meaning that all supernovae are lumped together under one notational category. In the case of LCC and DC, the broader category is stars, whilst UDC and DDC are the only schemes to provide a narrower category for supernovae. There is no further subdivision of the subject area into types and all the schemes are unable to cater for the differences between the process of producing supernovae and their aftermath, i.e., supernovae remnants. The specificity that is lacking in these schemes can be achieved through the application of further notation, feasibly in the form of key terms or facets. In comparison, the UAT has supernovae as a narrower term for either stellar remnant or stellar type, providing a choice of both categories.

The term supernovae is divided into the objects’ main types allowing for specification. The related terms help users searching within this subject area to find other related

UAT: (Unified Astronomy Thesaurus no date)

Stellar Astronomy Supernovae

Broader Terms: Stellar Remnants - Stellar Types
Narrower Terms: Core-collapse supernovae - Hypernovae - Type Ia supernovae
Related Terms: burst astrophysics - ejecta - supernova dynamics - white dwarf stars

Example 5. UAT.

Z         INTERDISCIPLINARY SUBJECTS
ZM        Astronomy and Astrophysics
ZMAAAF    Astronomy and astrophysics
ZMBAAN    Celestial mechanics
ZMCAAW    Theoretical astrophysics
ZMEAAL    Solar system
ZMGAAB    Stars
ZMKAAZ    Radio sources, infrared, x-ray and gamma-ray sources
ZMMAAP    Galaxies, stellar systems
ZMRAAV    Interstellar matter
ZMTAAK    Astronomical measurements listed by type of observation
ZMVAAS    Astronomical techniques and instrumentation

Example 6. INSPEC Interdisciplinary subject schedule (Field 1973).

areas of research. Not all supernovae types are listed under the narrower terms; however, development of the thesaurus is still ongoing and this omission may be resolved in the future. The UAT displays the type of specificity which an astronomy classification would need to provide broad classificatory assistance.

3.5 Findings

The discourse on astronomy classification was limited as the main literature source came from the LISA conference proceedings with evidence found supporting the revision of the UDC and the UAT with the proviso to improve library classification for astronomy collections. Crovisier and Intner (1987, 32), whose work on the revision of LCC is central to the historical review, argue that “In classification, as in mechanics, inertia is a powerful agent against change.” This statement defines the evolution of astronomy classification within universal classification schemes and their lack of consistent maintenance. These schemes have attempted to introduce specificity within their topic headings with the implementation of subject revisions and auxiliary tables, allowing for notation to be built with additional criteria; however, their main purpose is for classifying universal knowledge, thereby neglecting specificity. The only special astronomy classification that could be found currently employed was DC, yet even this scheme is lacking specificity as it is built around the collections rather than the discipline. It is with the development of a fully faceted scheme that the discipline of astronomy can find the specificity and flexibility needed to allow continuous revision without compromise to the whole scheme and complete control over notation building. Furthermore, the use of UAT, which has proven itself to be a flexible and reliable astronomical thesaurus, would provide sound descriptive elements and up-to-date terminology.

4.0 Case study: INSPEC

4.1 Geophysics, astronomy, and astrophysics

The creation of the INSPEC Classification in 1969 revolutionized the way STEM subjects were viewed in universal classification schemes. These schemes realised the need to develop scientific areas to compete with the dominant humanities schedules. Originally INSPEC was created to provide journal classification enabling the searching of journal articles with its sectional classification schedules. The concordance to the INSPEC Classification between 1969-1976 (INSPEC 1976) provides a lack of evidence of notational coding for the discipline of astronomy before 1973, with the first “schedule” dedicated to astronomy and astrophysics found within the 1973 INSPEC interdisciplinary subjects schedule (Field 1973, 51).

H1 – ZAAAAZ Interdisciplinary Subjects
H2 – ZMAAAF Astronomy and Astrophysics
H3 – ZMGAAB Stars
H4 – ZMGGAJ Specific stellar objects
H5 – ZMGGGX Supernovae and supernovae remnants

Example 7. Supernovae classification (INSPEC 1973).
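The level structure in Example 7 can be read off the codes mechanically. The following is a minimal sketch, assuming the rule this section reports from Field (1973): the number of As signals the hierarchical level, from four As (broadest) to none (narrowest). Treating the first letter as a section letter and the sixth as the check character is an inference from the five codes shown, and the function name is hypothetical.

```python
def inspec_1973_level(code: str) -> int:
    """Return the hierarchical level (1 = broadest, 5 = narrowest) of a
    six-letter 1973 INSPEC code, judged by its padding As.

    Assumption: letter 1 is the section letter, letters 2-5 carry the
    hierarchy, and letter 6 is the check character, so only letters 2-5
    are inspected.
    """
    if len(code) != 6 or not (code.isalpha() and code.isupper()):
        raise ValueError(f"expected six capital letters, got {code!r}")
    padding = code[1:5].count("A")  # four As = broadest, none = narrowest
    return 5 - padding


# The five codes listed in Example 7, broadest to narrowest.
for code in ["ZAAAAZ", "ZMAAAF", "ZMGAAB", "ZMGGAJ", "ZMGGGX"]:
    print(code, "-> H%d" % inspec_1973_level(code))
```

Counting padding letters keeps the rule readable, but it would misjudge any code that uses A as a genuine subdivision letter, which may be one reason the notation was hard to apply.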

A PHYSICS
A90 GEOPHYSICS, ASTRONOMY AND ASTROPHYSICS
A91 Solid Earth physics
A92 Hydrospheric and lower atmospheric physics
A93 Geophysical observations, instrumentation, and techniques
A94 Aeronomy, space physics, and cosmic rays
A95 Fundamental astronomy and astrophysics, instrumentation and techniques and astronomical observations
A96 Solar system
A97 Stars
A98 Stellar systems; Galactic and extragalactic objects and systems; Universe

Example 8. Geophysics, astronomy, and astrophysics (INSPEC 2004).

SC (H1) – A Physics
H1 (H2) – A9000 Geophysics, astronomy and astrophysics
H2 (H3) – A9700 Stars
H3 (H4) – A9760 Late stages of stellar evolution
H4 (H5) – A9760B Supernovae

Example 9. Supernovae classification (INSPEC 2004).
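The chain of broader classes in Example 9 can be derived from the narrowest code alone. The sketch below is illustrative only: the digit pattern (one section letter, four digits, optional letter suffix, with zeros padding the broader levels) is inferred from Examples 9 and 10 rather than taken from the INSPEC documentation, and the function name is hypothetical.

```python
import re


def inspec_2004_ancestors(code: str) -> list[str]:
    """List the broader classes of a 2004-style INSPEC code, broadest first.

    Assumed pattern (inferred from Examples 9 and 10): section letter,
    four digits, optional letter suffix; broader levels pad with zeros.
    """
    match = re.fullmatch(r"([A-Z])(\d{4})([A-Z]?)", code)
    if match is None:
        raise ValueError(f"unrecognised code: {code!r}")
    section, digits, suffix = match.groups()
    chain = [
        section,                      # section code letter, e.g. A
        f"{section}{digits[0]}000",   # e.g. A9000
        f"{section}{digits[:2]}00",   # e.g. A9700
        f"{section}{digits}",         # e.g. A9760
    ]
    if suffix:
        chain.append(code)            # e.g. A9760B
    # Drop repeats that arise when the input is itself a broad code.
    result: list[str] = []
    for item in chain:
        if not result or result[-1] != item:
            result.append(item)
    return result


print(inspec_2004_ancestors("A9760B"))
# ['A', 'A9000', 'A9700', 'A9760', 'A9760B']
```

Because each broader class is a prefix padded with zeros, narrowing or broadening a search amounts to lengthening or truncating the significant part of the code.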

Within the initial schedules for the unified INSPEC Classification, there was a limit of five hierarchical levels, to prevent excessive complexity. Each hierarchical level could contain more than ten subdivisions; so, to mitigate against loss of order each subdivision was allocated a six-letter code (Field 1973, viii). The sequence relied upon the number of As to ascertain the hierarchical level, with no As in the sequence indicating the narrowest hierarchy and four As representing the broadest hierarchy, as seen in Example 7. The last letter in the sequence was always a check number (Field 1973, viii).

The notation applied to the schedule was impractical for subject collocation. The astronomy schedules were displaced from the physics schedules (A-R) and were placed at the end of the notational sequence under Z. Eventually, the complexity of applying notation and astronomy’s increasing use as a developing discipline area saw it revised into a mainstream schedule within the physics section by 1978. The newest physical version of the schedule was released in 2004 with the inclusion of geophysics, as seen in Example 8 (INSPEC 2004). The same version of the INSPEC schedule can be found in 1995 (INSPEC 1995) and 1999, and with minor variations in the 1988, 1981-2, and 1978 schedules (INSPEC 1999; Institution of Electrical Engineers 1988; 1982; 1981; 1978).

The schedule retains five levels of hierarchy; the first level is now the section code letter and the four numbers plus end letter replace hierarchy levels two to five. It follows that the more specific the number the narrower the search becomes, with the narrowest hierarchy containing a letter suffix to finish the notation (INSPEC 2004, v), as seen in Example 9.

The classification notation changed dramatically over a thirty-year period with fundamental changes to the hierarchical structure of the astronomy and astrophysics schedule. Instead of being placed under “stars and specific stellar objects” (Example 7), supernovae and supernovae remnants were split into subsections for “stars and late stages of stellar evolution” (Example 9) and “stellar systems; galactic and extragalactic objects and systems; universe and interstellar medium; nebulae” (Example 10). Using the example of supernovae and supernovae remnants, it can be seen over time that the INSPEC schedules have tried to cater towards discipline developments. The scheme has been adapted to consider the context in which supernovae are studied either within the context of stellar

SC(H1) – A Physics
H2 – A9000 Geophysics, astronomy and astrophysics
H3 – A9800 Stellar systems; Galactic and extragalactic objects and systems; Universe
H4 – A9840 Interstellar medium; nebulae
H5 – A9840N Supernova remnants

Example 10. Supernova remnants (INSPEC 2004).

Expressiveness – The basic hierarchical structure facilitates narrow and broad searching. Subsections display fewer relationships between topics on the same level, but a general order is established based on prominence within the subject area. Could be improved by reworking subsections into a specific order relevant to the context of the subject area.

Mnemonic – Follows set notational rules. The index provides access to subject notation, making it easy for the classifier to find specific subject notations. Could improve memorability by providing shorter mixed notation, including elements of literal mnemonics.

Simplicity – Lacks the simplicity of notation needed for ease of retrieval in physical library environments, but notation has become simpler and easier to understand. Based on a hierarchy, it is appropriate when determining collocation on shelves or in databases, but could be adapted to reduce notation length.

Uniqueness – Uniqueness for each hierarchical level of the subject. Notation could be improved by providing shorter notation for general subjects and longer notation for specific subjects, thus providing clear distinction between levels of notation and easing confusion when applying notation.

Brevity – Mixed notation of letters and numbers. A maximum of two letters and four numbers used in a notation, with letters providing the first and last points in the notational sequence. Provides a huge array of notational fields and notational flexibility but could be improved by applying shorter notation for general subject areas.

Flexibility – Provides many subject areas in which to classify, and specific subjects can be chosen without impacting on the rest of the scheme. Building notation cannot be achieved, leading to less flexibility within individual library settings for specific or interdisciplinary subjects.

Hospitality – Can be easily expanded for the inclusion of new subject areas, object types, and phenomena. Extension of notation could be undertaken under each main sub-heading as they are yet to fill all the subfields, and also under the main heading A99. However, expansion as with enumerative schemes may disrupt the order of the scheme.

Notational structure – Lacks comprehensive faceted auxiliary tables allowing for flexible notation building. INSPEC is set rather than flexible and subject coverage is wide-ranging although rather imprecise. Provides for 2,174 notational fields within the physics section alone, with 389 of those applying to the geophysics, astronomy, and astrophysics schedule (INSPEC 2004). There is still room for expansion, as many notational fields have yet to be created and used. Notational structure of this scheme could be improved.

Table 1. Classification qualities of INSPEC and improvements.

evolution (Example 9) or as a by-product of stellar evolution and their place within the universe (Example 10). The order of the subsections was rearranged into a more user-friendly sequence and notation changed to fit in with the other main classes. This contextual flexibility is not seen in the universal schemes.

4.2 Classification qualities

The qualities INSPEC displays are typical of an analytico-synthetic scheme: being hierarchical in nature, providing set notation as well as displaying faceted principles, where subject areas are contextually linked and specific, including an index of specific subjects. Outlining the qualities displayed by INSPEC facilitates thoughts on classification principles that could improve its functionality. The qualities are based on Berwick Sayers’ (1955, 60) ideal qualities of a classification scheme (brief, simple, and flexible): expressiveness, mnemonic, simplicity, uniqueness, brevity, flexibility, and hospitality (see Table 1).

4.3 Findings

The INSPEC Classification has provided evidence that faceted principles can help serve the discipline of astronomy and has facilitated analysis of faceted notational improvements, which could increase the scheme’s quality. The case study has highlighted the importance of hierarchy and providing multiple classifying options for contextual flexibility. The notational element itself is concise but complicated. An adaptation of the notation by removing unnecessary length in general subject areas would allow for ease of application and retrieval. A solution providing for collocation, ease of retrieval and application, and brevity and flexibility could be to include a separate auxiliary table containing key terms and astronomical objects, enabling tagging to take place. The tags would then provide searchable access points to other resources. Although relatively unusual, this would allow for notation to stay simple, flexible, and hospitable. Recent research has focused on developing tagging systems to improve searching (see, for example, Mendes et al. 2009). Lawson (2009) reported on improving the library catalogue through the inclusion of tags. There also has been promising work on the potential of social tagging to enhance traditional subject cataloguing; Pera, Lund, and Ng (2009) developed EnLiS, a library system that improves user searches by using folksonomies to perform similarity matches between keywords in the query and user tags from LibraryThing, improving the relevance of results. Finally, Hedden (2008) advocates the use of semantic tagging that links tags into meaningful taxonomies, so incorporating tagging would not be out of line with current and developing practices. More traditionally, the application of notation building features would provide specific notations allowing for subject complexity.

Broughton (2006; 2015) has argued for the importance of faceted classification by supporting its development and heralding it as the future of information retrieval, and Kumbhar (2012, 11) advises that the best way to organise bibliographic material is through a faceted classification scheme based on scientific principles. Other studies of faceted classification in information retrieval include Mills’ (2004) Library Trends paper on faceted classification and logical division in information retrieval, La Barre’s (2006) PhD thesis, Gnoli and Mei’s (2006) study of freely faceted classification for web based retrieval, Uddin and Janacek’s (2007) development of a multidimensional classification system in the web that can provide an alternative but convenient structure for organising and finding information content, and Tunkelang’s (2009) lecture on using faceted search to provide more effective information seeking support to users in online search systems. It is with this evidence base in mind, and the application of the notational improvements above and use of UAT vocabulary, that development of a new faceted astronomy classification is undertaken.

5.0 Development of a faceted astronomy classification

5.1 Principles of faceted classification

The main object of faceted classification is facet analysis. Vickery (1960, 13) describes three main steps of facet analysis in the construction of faceted classification schemes:

(i) to assign an order in which the facets will be used in constructing compound subject headings,
(ii) to fit the schedules with a notation which permits the fully flexible combination of terms that is needed and which throws subjects into a preferred filing order, and
(iii) to use the faceted scheme in such a way that both specific reference and the required degree of generic survey are possible.

It begins with the creation of facets, the analysis of specific aspects of a subject. Ranganathan developed five facets, each to be applied within the stated order: 1) personality; 2) matter; 3) energy; 4) space; and, 5) time, also known as PMEST (Broughton 2015). Personality describes the specific subject within the item to be classified; matter the properties or material of the subject matter; energy the processes of the subject; space the positioning or location of the subject; and, time the date of the subject matter. Vickery (1960, 30) adapted Ranganathan’s facets into ten new facets: 1) substance, product, organism; 2) part, organ, structure; 3) constituent; 4) property and measure; 5) object of action, raw material; 6) action, operation, process, behaviour; 7) agent, tool; 8) general property, process, operation; 9) space; and, 10) time. Facets can be specific or broad, simple or complex, and can cover almost any aspect of a subject.

Once facets have been chosen for a subject area they need to be split into foci, the second level of subject analysis, and isolates, the third level of subject analysis, to get the basic structure of the scheme. Foci are aspects of a facet and isolates are grouped into foci. The foci are then ordered into a suitable arrangement for the subject area, known as order in array (Rowley and Farrow 2000; Rowley and Hartley 2008, 182). Batley (2005, 122) lists four possible arrangements of foci: logical, procedural, chronological, and alphabetical. Ranganathan (1937, 42-43) suggested any order based on systematic principles, such as quantitative, developmental, spatial or time (evolutionary), and canonical. It can be supposed that the arrangement of foci within the facets will be dependent on the main use and subject of the scheme.

Once the basic structure of the scheme is developed, the schedule order and notation are established. Schedule order can either be specific before general, or inverted, so general subject areas come before specific topics, known as the principle of inversion (Rowley and Farrow 2000, 204). Ranganathan proposed the use of inverted order for schedules and notation in faceted classifications. To create notation flexibility, subdivisions from different facets may be joined together to create unique call numbers. This process involves applying a syntax to develop order within the arrangement of facets, helping to aid retrieval. The use of punctuation symbols by Ranganathan in the CC (comma for personality, semicolon for matter, colon for energy, full stop for space, and apostrophe for time) provided a syntax that allowed compound subjects to be constructed and facets to be linked together. The citation order of the facets is inverted with personality, matter, and energy increasing in specificity and filed before the common facets space and time (Sayers [1926] 1975, 62-63). This study constructs a faceted astronomy classification using these classification principles.

5.2 Classification design

The first task is to create the relationship facets used to achieve the specifics of notation building. These facets are the “add-on” themes that can be applied to most subject areas. The facets used in the development of this scheme are influenced by Ranganathan’s PMEST and reflect the complex range of information in astronomy classification (see Table 2 below).

Each relationship facet will have its own auxiliary table providing coded notation for the terms, objects, and dates. The set facet is the beginning notation allowing for the division of material types. The remaining facets are added after the subject schedule notation.

5.2.1 Subject schedules for astronomy

Astronomy is a broad and complex interdisciplinary subject. To understand its complexities, a list of the most common subject areas within taught astronomy was drawn from current textbooks (see Vickery 1960, 20, for more on this methodological step). Five textbooks demonstrating universal astronomy topics were chosen and their tables of contents analysed for key and repetitive subject areas. The chapter introductions and summaries were discarded to keep the topic investigation clear. Table 3 shows the main topic areas found.

The topic investigation also found a new classification system created by Dick (2013), demonstrating a kingdom, family, and class classification structure for the classification of astronomy objects, based on hierarchical science kingdom classification systems. The three main kingdoms are: planets, stars, and galaxies. The kingdoms are arranged hierarchically, increasing in size and complexity. Comparing the hierarchical kingdom classification with the topic investigation results found an overlap in subject matter. Grouping the main topic areas within astronomical textbooks with Dick’s classification structure of astronomy objects produced four main subject areas, and examining current schedules for DDC, UDC, and INSPEC produced two minor subject areas (see Table 4 below).

The main subject divisions increase in evolutionary scale from localized phenomena and objects to the origins and evolution of the universe. The minor subject divisions cover the techniques and equipment used in astronomical observation. These six subject areas form the knowledge divisions of astronomy and each will form their own subject schedule. In addition to building classification from

Set Facet – Material Type: The item’s physical state, e.g., book (BK). Used for every classification to allow for filing order by material type.

Facet – Object: The individual astronomical object. Comprised of astronomical object catalogues such as Messier and planet/object type abbreviations, e.g., M13 (Globular Cluster).

Facet – Process: The active process present in the work/object. Comprised of physical processes from physical sciences, e.g. CYV (Cryovolcanism).

Common Facet – Location: The apparent location of an astronomical object, e.g. M13 – 16h41m 41.6s +36d27m41s. Comprised of a list of astronomical right ascension and declination measurements based on the J2000 equatorial coordinate system numerically coded to produce shorter notation.

Common Facet – Time: The year of observation, discovery, or first publication, e.g. Discovery: 1596. Location and time will only be used to distinguish between other works with the same notation or for specificity.

Table 2. Relationship facets.

Solar System
  Terrestrial Planets
  Giant Planets (Gas and Ice)
  Asteroids, Kuiper Belt, and Comets
  Planetary formation
  Geological and surface processes
  Atmospheric processes
(McBride and Gilmour 2004)

Sun and Stars
  Sun: surface, interior, and atmosphere
  Stars: measurement and observations
  Stellar Formation
  Main Sequence Life
  Stellar Death
  Stellar Remnants
(Green and Jones 2015)

Galaxies and Cosmology
  Milky Way composition
  Galaxy: classification, formation, and evolution
  Normal Galaxies
  Active Galaxies
  Galaxy distribution
  Universe: evolution, measurement, and problems
(Jones, Lambourne and Serjeant 2015)

Planetary Science
  Solar System: composition, formation, and dynamics
  Solar: energy transfer, atmosphere, and surface
  Planetary: atmospheres, surfaces, and interiors
  Magnetic Fields
  Other Solar System Bodies: meteorites, minor planets, comets, and planetary rings
  Extrasolar Planets
  Planetary Formation
(de Pater and Lissauer 2015)

Universe
  Astronomy: observation and equipment
  Planets and Moons: terrestrial, gas, and ice
  Stars: formation, evolution, death, and oddities
  Galaxies: Milky way, normal, and active
  Cosmology: origins, evolutional, and SETI
(Freedman and Kaufmann 2016)

Table 3. Main topic areas within astronomy textbooks.

Main Divisions: Planetary Systems; Stellar Systems; Galactic Systems; Cosmology

Minor Divisions: Astronomical Techniques; Astronomical Equipment

Table 4. Main and minor divisions.

the subject schedules and relationship facets, the scheme should ideally cater for subject searching via controlled subject headings.

5.2.2 Unified vocabulary

The UAT, a free resource delivering internationally unified astronomical terminology, is constantly revised by scientists and astronomers providing unified subject headings. The use of UAT subject headings in this classification scheme allows for scientific accuracy within subject areas and narrower and broader searching. As of 2015, the UAT had 1,906 terms, a range of twelve hierarchical levels and fifteen top level concepts (Frey et al. 2015). On the release of UAT v.1, the scheme was updated to include 1,834 terms, a range of ten hierarchical levels and eleven top level concepts (Unified Astronomy Thesaurus 2015). The current structure of the thesaurus has changed in the intervening period with the terminology being equally distributed over the top-level concepts. The release of UAT v.2.0.0 in early 2017 cleaned up subject duplications and provided sixteen new terms (Unified Astronomy Thesaurus 2017). This resource is necessary for this scheme’s interoperability.

5.2.3 Notation building

Literal mnemonics will be used to satisfy Berwick Sayers’ ideal notational qualities, and each main and minor division will start with a defining letter code that is representative of the subject area (see Table 5 below).

The main and minor divisions are subdivided three times to produce a sequence of three numbers, each with its own hierarchy. Three subdivisions enable a certain level of subject specificity which can be further quantified using the relationship facets. This produces a simple and memorable three-digit code at its maximum, which is then applied to the subject letter code, e.g. P111. The ending notation is based on the specificity of the work and the application of the relationship facets. The relationship facet notation varies between letter suffixes and numerical codes providing unique notation in most cases. Material type is denoted by a two-letter abbreviation; object by its catalogue code or three-letter abbreviation; process by a three-letter abbreviation; location by a two-digit code; and, time by its four-number year notation (see 5.3 Provisional astronomy classification for details). The application of these facets is dependent on the specificity of the work and material type; however, simplicity and uniqueness can still prevail by using only the material type abbreviation, code and three-digit subject code.

To synthesise the facets together, a syntax of notational punctuation will be used. The notational punctuation used to display the faceted relationships is built from mathematical and literary punctuation and has been influenced by Ranganathan’s work (see Table 6 below).

The schedule order is based on Ranganathan’s inverted principle for faceted classification, whereby general facets are filed before specific facets; the order being material type, object, process, location, and time. The facet for material type is crucial for the effective filing and retrieval of an item within a physical or online library environment and is added to the beginning of the call number. The other relationship facets are added after the subject notation based on increasing specificity. The use of mixed notation in the creation of a call number allows for Berwick Sayers’ ideals of classification to be upheld and for this scheme to have a modern classification approach. As will be seen in the examples below, the scheme’s main disadvantage is that compound subjects can create complex notation, making shelf arrangement challenging; though the application of complex notation is decided by the classifier and can, therefore, be negated.

5.2.4 Examples of classification

Reviewed below is an example outline of a main knowledge division in the subject schedules and an interpretation of classifying a book, paper, and observation using the basic schedules and relationship facets. They display the usability and functionality of the classification scheme for different material types and subject areas. The schedule order is hierarchical, based on Dick’s (2013) kingdom, family, and class classification and then further ordered spatially if applicable, e.g. solar system planets.

Main Divisions
P Planetary Systems
S Stellar Systems
G Galactic Systems
C Cosmology

Minor Divisions
AT Astronomical Techniques
AE Astronomical Equipment

Table 5. Coded main and minor divisions.

Facet Relationship Punctuation
/ material type relationship
; object relationship
‘ process relationship
, location relationship
. time relationship

Subject Relationship Punctuation
= equal knowledge of two or more subject areas
+ more than equal knowledge of two or more subject areas
- less than equal knowledge of two or more subject areas
: astronomical technique or equipment

Table 6. Punctuation for facets and subject relationships.
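Because each facet in Table 6 has its own punctuation mark, a class mark can be split back into its parts. The following is a rough sketch of that idea, not part of the proposed scheme: the regular expression, the function name, and the field widths (two-letter material type, three-letter process, two-digit location, four-digit year, as described in 5.2.3) are assumptions, and an ASCII apostrophe stands in for the printed quotation mark.

```python
import re

# Facet order and punctuation follow Table 6. The ASCII apostrophe replaces
# the typographic mark printed in the article (an assumption of this sketch).
CLASS_MARK = re.compile(
    r"^(?P<material>[A-Z]{2})/"          # set facet: material type
    r"(?P<subject>[A-Z0-9=+\-:]+)"       # subject code(s) and subject relations
    r"(?:;(?P<object>[A-Z0-9]+))?"       # object facet
    r"(?:'(?P<process>[A-Z]{3}))?"       # process facet
    r"(?:,(?P<location>[0-9]{2}))?"      # location facet
    r"(?:\.(?P<time>[0-9]{4}))?$"        # time facet
)


def parse_class_mark(mark: str) -> dict:
    """Split a class mark into its facets; absent facets map to None."""
    match = CLASS_MARK.match(mark)
    if match is None:
        raise ValueError(f"not a recognised class mark: {mark!r}")
    return match.groupdict()


print(parse_class_mark("DP/G113:AT13;M61.2015"))
```

Each facet is optional in the pattern, which mirrors the point that a classifier may stop at the material type and subject code when no further specificity is needed.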

Main Division: P Planetary Systems
Subdivision of Main Division: P1 Solar System
Second Subdivision of Main Division: P11 Planetary Types
Third Subdivision of Main Division: P111 Terrestrial Planets

Example 11. Main knowledge division.
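The nesting in Example 11 means every broader division can be recovered by truncating the subject code. A minimal sketch, assuming codes consist of a one- or two-letter division code followed by up to three digits, as in the provisional schedules; the function name is hypothetical.

```python
import re


def subject_ancestors(code: str) -> list[str]:
    """Return the division and each subdivision leading to a subject code."""
    match = re.fullmatch(r"([A-Z]{1,2})(\d{0,3})", code)
    if match is None:
        raise ValueError(f"unrecognised subject code: {code!r}")
    letters, digits = match.groups()
    # Each added digit is one step down the hierarchy.
    return [letters + digits[:i] for i in range(len(digits) + 1)]


print(subject_ancestors("P111"))  # ['P', 'P1', 'P11', 'P111']
print(subject_ancestors("AT13"))  # ['AT', 'AT1', 'AT13']
```

This prefix property is what would let an online version move from the broadest term to the narrowest while retaining the steps taken.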

Planet Mercury: From Pale Pink Dot to Dynamic World, by David A. Rothery.

BK
P Planetary Systems
P1 Solar System
P11 Planetary Types
P111 Terrestrial Planets

Relationship Facets
Object: Mercury  ;MER
Process: Geology  ‘GEO

Notation: BK/P111;MER’GEO

Example 12. Class mark for a book.

Cepheid Variables in the Flared Outer Disk of our Galaxy, by Michael W. Feast, John W. Menzies, Noriyuki Matsunaga and Patricia A. Whitelock. Nature 2014:509(7500): 342.

JA
S Stellar Systems
S3 Variable Stars
S31 Intrinsic Variables
S311 Pulsating Variables

G Galactic Systems
G5 Milky Way

Relationship Facets
Object: Cepheid Variables  ;CEV

Notation: JA/S311=G5;CEV

Example 13. Class mark for an article.

2015 Radio observation of spiral galaxy M61

DP
G Galactic Systems
G1 Galaxies
G11 Galaxy Types
G113 Spiral Galaxies

AT Astronomical Techniques
AT1 Observation Methods
AT13 Radio

Relationship Facets
Object: M61  ;M61
Time: 2015  .2015

Notation: DP/G113:AT13;M61.2015

Example 14. Class mark for a physical dataset: observation.
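The three class marks in Examples 12 to 14 can be assembled mechanically from their parts. The sketch below is one possible reading of the rules in 5.2.3, not the authors' implementation: the function and parameter names are hypothetical, and an ASCII apostrophe stands in for the printed process mark.

```python
# Facet punctuation from Table 6, in the stated filing order.
FACET_PUNCT = {"object": ";", "process": "'", "location": ",", "time": "."}
FACET_ORDER = ("object", "process", "location", "time")
SUBJECT_PUNCT = {"=", "+", "-", ":"}


def build_class_mark(material, subject, related=(), **facets):
    """Assemble a class mark: material type, subject(s), then facets.

    `related` is a sequence of (symbol, code) pairs that link further
    subject codes, e.g. [("=", "G5")] or [(":", "AT13")].
    """
    unknown = set(facets) - set(FACET_ORDER)
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    mark = f"{material}/{subject}"
    for symbol, code in related:
        if symbol not in SUBJECT_PUNCT:
            raise ValueError(f"unknown subject relationship: {symbol!r}")
        mark += f"{symbol}{code}"
    # Facets are appended in filing order, whatever order they were passed in.
    for name in FACET_ORDER:
        value = facets.get(name)
        if value is not None:
            mark += f"{FACET_PUNCT[name]}{value}"
    return mark


print(build_class_mark("BK", "P111", object="MER", process="GEO"))
# BK/P111;MER'GEO
print(build_class_mark("JA", "S311", related=[("=", "G5")], object="CEV"))
# JA/S311=G5;CEV
print(build_class_mark("DP", "G113", related=[(":", "AT13")], object="M61", time=2015))
# DP/G113:AT13;M61.2015
```

Fixing the facet order inside the function keeps marks consistent between classifiers, while the choice of which facets to supply stays with the classifier.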


5.3 Provisional astronomy classification

Provisional Astronomy Classification Subject Schedules

Subject Schedules – Planetary Systems (P), Stellar Systems (S), Galactic Systems (G), Cosmology (C), Astronomical Techniques (AT), Astronomical Equipment (AE).

Facets – Material Type, Object, Process, Location, Time.

Material Type – Book (BK), Journal Article (JA), Sky Atlas (SA), Dataset – Physical (DP), Dataset – Digital (DD), Photographic Plates (PP), Equipment (EQ).

Object – Messier catalogue, Planets, Moons, Stellar Types, Galactic Types.

Process – Planets: Formation Processes, Surface and Interior Geology, Atmospheric Processes. Stars: Formation Processes, Core Reactions, Atmospheric Processes. Galaxies: Formation Processes.

Location – RA & Dec Measurements: Northern Hemisphere (Dec, RA) and Southern Hemisphere (Dec, RA).

Time – Date Ranges: BC, 0-500, 501-1000, 1001-1500, 1501-2000, 2001-2500.

Facet Relationship Punctuation
/ material type relationship
; object relationship
‘ process relationship
, location relationship
. time relationship

Subject Relationship Punctuation
= equal knowledge of two or more subject areas
+ more than equal knowledge of two or more subject areas
- less than equal knowledge of two or more subject areas
: astronomical technique or equipment

Basic Draft Schedules

P Planetary Systems
  P1 Solar System
    P11 Planetary Types
      P111 Terrestrial Planets
      P112 Gas Giant Planets
      P113 Ice Giant Planets
    P12 Planetary Features
      P121 Moons/Satellites
      P122 Rings
      P123 Radiation Belts
    P13 Solar System Objects
      P131 Dwarf Planets
      P132 Meteoroids
      P133 Asteroids
      P134 Comets
      P135 Trans-Neptunian Objects
    P14 Solar System Regions
      P141 Asteroid Belt
      P142 Kuiper Belt
      P143 Oort Cloud
  P2 Extrasolar Planets
  P3 Planetary Formation
    P31 Protoplanetary Disk
    P32 Planetary Collisions
    P33 Planetary Migration
  P4 Interplanetary Medium Features
    P41 Gas
    P42 Dust
    P43 Solar Wind
    P44 Cosmic Rays
  P5 Astrobiology

S Stellar Systems
  S1 Stellar Sequence
    S11 Pre-Main Sequence Stars
      S112 Protostars
    S12 Main Sequence Stars
      S121 O Class
      S122 B Class
      S123 A Class
      S124 F Class
      S125 G Class
      S126 K Class
      S127 M Class
    S13 Post-Main Sequence Stars
      S131 Subgiant Class
      S132 Giant Class
      S133 Bright Giant Class
      S134 Supergiant Class
      S135 Hypergiant Class
    S14 Stellar Evolution-Death
      S141 Supernovas
      S142 Novae
      S143 White Dwarfs
      S144 Neutron Stars
      S145 Black Holes
      S146 Planetary Nebula
    S15 Stellar Remnants
      S151 Supernova Remnants
      S152 Nova Remnants
      S153 Planetary Nebula Remnants
  S2 Multiple Star Systems
    S21 Binary Stars

      S211 Brown Dwarfs
    S22 Multiple Stars
    S23 OB Associations
    S24 Stellar Clusters
      S241 Open Clusters
      S242 Globular Clusters
  S3 Variable Stars
    S31 Intrinsic Variables
      S311 Pulsating Variables
      S312 Eruptive Variables
      S313 Cataclysmic Variables
    S32 Extrinsic Variables
      S321 Rotating Variables
      S322 Eclipsing Binaries
      S323 Planetary Transits
  S4 Exotic Stars
    S41 Quark Stars
    S42 Boson Stars
    S43 Electroweak Stars
    S44 Preon Stars
    S45 Planck Stars
  S5 Interstellar Medium Features
    S51 Dust
    S52 Gas
      S521 H I Cloud
      S522 H II Cloud
      S523 H2 Cloud
    S53 Stellar Wind
    S54 Galactic Cosmic Rays

G Galactic Systems
  G1 Galaxies
    G11 Galaxy Types
      G111 Elliptical Galaxies
      G112 Lenticular Galaxies
      G113 Spiral Galaxies
      G114 Irregular Galaxies
      G115 Barred Galaxies
      G116 Dwarf Galaxies
    G12 Active Galaxies
      G121 Seyfert Galaxies
      G122 Radio Galaxies
      G123 Quasars
      G124 Blazars
    G13 Galactic Features
      G131 Galactic Ring
      G132 Galactic Accretion Disk
      G133 Galactic Jets
      G134 Galactic Halos
  G2 Multiple Galaxy Systems
    G21 Binary Galaxies
    G22 Interacting Galaxies
    G23 Galactic Clusters
    G24 Galactic Superclusters
  G3 Galaxy Formation
    G31 Galaxy Mergers
  G4 Intergalactic Medium Features
    G41 Gas
    G42 Dust
    G43 Galactic Wind
    G44 Extragalactic Cosmic Rays
  G5 Milky Way

C Cosmology
  C1 The Universe
    C11 Origins
    C12 Evolution
    C13 Age
      C131 Hubble Time
    C14 Temperature
      C141 Black-body Spectrum
  C2 Cosmic Matter
    C21 Baryonic Matter
      C211 Hydrogen Plasma
      C212 Helium Plasma
    C22 Dark Matter
      C221 Baryonic Dark Matter
      C222 Non-baryonic Dark Matter
    C23 Cosmic Density/Energy
      C231 Dark Energy
      C232 Dark Energy Density
  C3 Cosmic Radiation
    C31 Electromagnetic Radiation
      C311 Radio waves
      C312 Microwaves
      C313 Infrared Radiation
      C314 Visible Light
      C315 Ultraviolet Radiation
      C316 X-Rays
      C317 Gamma Rays
    C32 Cosmic Microwave Background
    C33 Cosmic Background Radiation
  C4 Cosmic Expansion
    C41 Hubble’s Law
      C411 Hubble Flow
      C412 Hubble Constant
      C413 Hubble Parameter
    C42 Redshift
    C43 Blueshift
  C5 Cosmological Models
    C51 Special Relativity
      C511 Space-Time
      C512 Time Dilation
    C52 General Relativity
      C521 Curvature of Space-Time

C522 Einstein’s Field Equations
C53 Cosmological Principle (Uniformity)
C54 Einstein Model
G55 de Sitter Model
G56 FRW Models
G57 Einstein-de Sitter Model
G58 Eddington-Lemaître Model
G59 Lemaître Model
C6 Cosmological Theories
C21 Big Bang
C22 Big Crunch
C23 Multiverses
C24 Rebound Theory
C25 String Theory
C7 SETI

AT Astronomical Techniques
AT1 Observation Methods
AT11 Photometry
AT12 Spectroscopy
AT13 Radio
AT14 Infrared
AT15 Gamma Ray
AT16 X-Ray
AT17 Microwave
AT18 Ultraviolet
AT2 Distance Measurements
AT21 Trigonometric Parallax
AT22 Spectroscopic Parallax
AT23 Doppler Shift
AT24 Hubble’s Law
AT3 Luminosity Measurements
AT31 Apparent Magnitude
AT32 Absolute Magnitude

AE Astronomical Equipment
AE1 Observatories
AE11 Earth Observatories
AE12 Space Observatories
AE2 Telescopes
AE21 Optical Telescopes
AE211 Refracting Telescope
AE212 Reflecting Telescope
AE213 Catadioptric Telescope
AE22 Non-Optical Telescopes
AE221 Radio Telescope
AE222 X-Ray Telescope
AE223 Gamma Ray Telescope
AE224 Infrared Telescope
AE225 Ultraviolet Telescope

5.4 Proposed use of classification scheme

The scheme’s author has envisaged the classification scheme in an accessible online format, whereby institutions hold an online subscription much like WebDewey and LCC’s Classification Web. Each subject schedule could be searched hierarchically through an expandable linked list, thereby making it easy to go from the broadest term to the narrowest whilst retaining the steps taken. Ideally, the scheme would allow for a split screen mode, meaning classifiers could search as many subject schedules as needed to find the subject codes for the work. This would be advantageous not only for comparison of subjects for non-experts but also for applying subject codes to interdisciplinary works. The relationship facets (material type, object, process, location, time) would have their own tabs under set facet, facet, and common facet, with an expansion list of coded notation in numerical or alphabetical order. Furthermore, a keyword search function on each page would provide intuitive manipulation of the lists for ease of searching. A feature which could be integrated into the online scheme is a classification checker, whereby the classifier could input the main classificatory features of an item (drop down lists provided) into a “classification calculator” (see mock up below). The scheme would then automatically arrange the information based on the classification rules and add in relevant relationship punctuation to form a list of feasible class marks for the item. The classifier would then be able to double check the class marks and ascertain their suitability within their collection. Other tabs in the online format would hold key information on how to build and check a class mark as well as information on improvements and revised subject schedules. Feedback through an online form, as well as an “ask a librarian” feature, would enable classifiers to directly give feedback to the system’s creators, allowing for developments to occur more quickly than in other traditional classification schemes.

Subject 1: ______    Material Type: ______    Process: ______
Subject 2: ______    Object 1: ______         Location: ______
Subject 3: ______    Object 2: ______         Time: ______

[Search Classification]

Class mark 1:
Class mark 2:

Example 15. Mock up of “classification calculator” for astronomy classification.

6.0 Concluding comments

The discipline of astronomy requires its own classification scheme for several reasons. Firstly, the provision of astronomy schedules within current universal schemes lacks the specificity for complex subject classification. Secondly, special classification schemes built for astronomy have either been merged with physics-based schemes or are too simple to provide useful notation in environments with multiple material types. And thirdly, the lack of flexibility within these schemes means there is not a comprehensive interdisciplinary scheme for use in astronomical libraries. Developing a special classification scheme on faceted classification principles provides a specific and flexible classification catering to the discipline’s interdisciplinary nature. Adding unified terminology in the form of broader and narrower terms through UAT enhances the usability of a new scheme within astronomy libraries.

This study has provided the means of investigating an overlooked discipline area within librarianship discourse and has provided an insight into astronomy classification and its relationship within classification types. Understanding the nature of the discipline of astronomy and how it can be provided for within library classification schemes is key to the continuation and future development of astronomy classification for specialist collections. Further research that could be undertaken consists of revising, finishing, and testing the classification development within a physical library setting, where its flexibility and functionality can be analysed and the scheme improved.
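As a rough illustration of the hierarchical browse proposed in Section 5.4 (going from the broadest term to the narrowest whilst retaining the steps taken), the following minimal Python sketch derives the chain of broader terms from a notation. It assumes, as an inference from the schedules rather than a rule stated by the authors, that a notation is a letter prefix for the main class followed by digits, and that each further digit marks one step narrower. The function name and the small sample dictionary are hypothetical; the captions are copied from the schedules.

```python
import re

# A few entries copied from the schedules (notation -> caption).
SCHEDULE = {
    "G": "Galactic Systems",
    "G1": "Galaxies",
    "G11": "Galaxy Types",
    "G113": "Spiral Galaxies",
    "AT": "Astronomical Techniques",
    "AT1": "Observation Methods",
    "AT12": "Spectroscopy",
}


def broader_to_narrower(code, schedule=SCHEDULE):
    """Return (notation, caption) pairs from the main class down to `code`."""
    match = re.fullmatch(r"([A-Z]+)(\d*)", code)
    if match is None:
        raise ValueError(f"unrecognised notation: {code!r}")
    letters, digits = match.groups()
    # Assumption: every additional digit denotes one step narrower.
    steps = [letters + digits[:i] for i in range(len(digits) + 1)]
    # Steps missing from the schedule are skipped rather than guessed.
    return [(step, schedule[step]) for step in steps if step in schedule]


if __name__ == "__main__":
    for notation, caption in broader_to_narrower("G113"):
        print(notation, caption)
```

An online version of the scheme could show such a chain as a breadcrumb trail above the expandable list. Entries whose printed notation does not follow the prefix pattern would need to be handled separately.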

Knowl. Org. 46(2019)No.4 279 H. D. White. Patrick Wilson

Patrick Wilson†
Howard D. White
Drexel University, College of Computing and Informatics, Philadelphia, PA 19104, USA

Howard D. White, a professor emeritus in Drexel University’s College of Computing and Informatics, has received both the Derek de Solla Price Memorial Medal in scientometrics (2005) and the Award of Merit (2004) from the Association for Information Science and Technology. His PhD in librarianship (1974) is from the University of California at Berkeley, where he was Patrick Wilson’s doctoral student and co-taught a seminar with him. Many years later he co-authored a book, For Information Specialists: Interpretations of Reference and Bibliographic Work, with Marcia J. Bates and Wilson, and he wrote the introduction to Wilson’s autobiographical memoir.

White, Howard D. 2019. “Patrick Wilson.” Knowledge Organization 46(4): 279-307. 75 references. DOI:10.5771/0943-7444-2019-4-279.

Abstract: During 1965-2001, Patrick Wilson brought the acuity of a professional philosopher to library and information science (LIS) and became a major theorist in many aspects of knowledge organization (KO). This article, an extensive critical introduction to his thought, reflects the view that much of his work is of permanent value. He can be read for well-informed critiques of the instruments by which writings are organized for retrieval—the bibliographical side of KO. He can also be read for shrewd accounts of personal knowledge and behavior with respect to societal information systems—the social-epistemological side of KO. Indeed, in his work the two sides converge. One of his themes is the preferability of human consultants over bibliographies and catalogs for answering questions. He thus writes at length about the social organization of possible consultants and their degrees of cognitive authority in communicating what they know. Another theme is the desirability of indexing writings not only by subject but also by their possible utility in helping individuals. For that, however, he saw little hope. A third theme is ideal information systems. Broadly, he can be read for his clarifications of concepts on both sides of KO, such as bibliographical control, relevance, subject indeterminacy, information needs, information overload, librarians’ roles, and LIS as a field.

Received: 20 April 2019; Accepted: 29 April 2019

Keywords: Patrick Wilson, information, knowledge, subject, knowledge organization

† Derived from the article of similar title in the ISKO Encyclopedia of Knowledge Organization, Version 1.0 published 2019-05-07. Article category: Biographical articles

Patrick Wilson (1927–2003) artfully brought rigorous attention to certain fundamental problems of knowledge organization in library and information science (LIS). His post-doctoral career spanned the 1960-2000 epoch in which the human-literature interface became the human-computer-literature interface. Although his work largely antedates the web and Google, his analytical abilities, informed by very wide reading, are such that many of his works are likely to last. That is because he excelled at describing people’s situations vis-à-vis information services that the latest technology does not necessarily improve. He used philosophical reflection and thought experiments rather than empirical techniques to arrive at these descriptions, but in so doing he drew extensively on empirical research by others. A recurrent strategy of his is to characterize ideal services as a way of revealing the shortcomings of actual ones. Throughout his works he is hard on librarians insofar as their professional literatures hold out false promises for their services, but he is equally hard on information scientists insofar as their professional literatures rest on glib assumptions about what their algorithms or hypothetical systems will do. In both fields, he undertook to deflate unwarranted claims and to temper even warranted claims with modesty. He was of a pragmatic and skeptical turn of mind.

Wilson’s background was unusual among information scientists, many of whom come from the sciences or engineering. His bachelor’s and PhD degrees, both from the University of California at Berkeley, were in philosophy. He had as well a bachelor’s degree in library science from Berkeley and experience in various Berkeley library jobs. These included part-time map cataloging while in school and then professional positions in reference librarianship (1953) and South Asian studies librarianship (1954-1959). In the latter position, he published three large bibliographies (Wilson 1956; 1957a, b) while also writing a dissertation in the Anglo-American tradition of concept analysis
(Wilson 1960a). He subsequently taught philosophy during 1960-1965 at the University of California at Los Angeles, and his first publications—on J. L. Austin, W. V. O. Quine, and aesthetics—appeared in philosophy journals (Wilson 1960b, 1965, 1966a). Given his earlier jobs, however, he was uniquely qualified to do something new—that is, to analyze as a philosopher what he had learned as a bibliographer (Wilson 1998, 307-308). His conclusions, moreover, could be extended to all organizers of writings and thence to libraries and information services in the wider context of their users and non-users.

After transferring from UCLA to the faculty of Berkeley’s library school in 1965, he taught cataloging and published his first book, a treatise on bibliographical control called Two Kinds of Power (Wilson 1968). 2KoP’s forceful abstractions have won it many admirers (e.g., Smiraglia 2007, 2014), but its immediate forerunner was concrete and practical: a long, multidisciplinary, multilingual bibliography on South Asian science (Wilson 1966b). His major creative period was 1968-1983, during which his three books and most influential papers appeared, but he also developed many fresh ideas in the papers and book reviews of 1984-2001 (e.g., his analysis of copyright in Wilson 1990). A conference honoring his contributions to LIS was held in Sweden in 1993 (Olaisen et al. 1996). In a late memoir that is the best short account of his intellectual life (Wilson 1998), he calls himself, dryly, “a bibliographer among catalogers.” A long, fascinating set of interviews he gave in an oral history project (Wilson 2000) is titled Patrick G. Wilson, Philosopher of Information: An Eclectic Imprint on Berkeley’s School of Librarianship, 1965-1991. He was dean of that school (now the School of Information) during 1970-1975 and its acting dean during 1989-1991. In 2001, the American Society for Information Science and Technology gave him its highest honor for career achievement, the Award of Merit. His acceptance speech (Wilson 2001a) brilliantly distills the range of problems that attracted him.

What follows moves freely across his writings to extract themes and sometimes to contest points that bear on knowledge organization (KO). Responses to his work by later writers are selectively cited but not discussed. Neither are his twenty-five book reviews (with one exception), but they appear in the Appendix. A superb writer, Wilson elaborates and qualifies his ideas in considerable detail, and his arguments and occasionally amusing examples can be read for pleasure. The suasion of his style is largely lost in the present overview, but his own prose will often be quoted (italics in the quotations are his). His preferred form “bibliographical” has been adopted here, except when he or others use “bibliographic.” He also used “he” and “a man” in the old-fashioned way to stand for both sexes.

In its well-established narrower sense, KO deals principally with describing documents and organizing the descriptions for retrieval—that is, with products long associated with LIS (Hjørland 2016; Zeng 2008). But Hjørland (2003, 2008) and Andersen and Skouvig (2006) argue for a broader interpretation of KO—one that goes beyond the bibliographical concerns of LIS to relate knowledge to persons, groups, practices, and institutions in society. This sense has much in common with the field of social epistemology (Goldman and Blanchard 2018), which examines who knows what and how they know it. In Wilson, the two conceptions of KO converge. He wrote, for example (2KoP, 118), “The use of bibliographical instruments is frequently a stupid activity, as is, I suspect, known more or less clearly to many scholars, and provides an excellent reason why they should not do more of it.” The present account portrays both the bibliographical side and the social-epistemological side of his writings, with emphasis on their fusion (see also Hjørland 1996; Munch-Petersen 1996; Andersen 2004; Furner 2010).

2.0 Consultants and aids

An early non-philosophical work of Wilson’s hints at his subsequent thought. The first words of his “Introduction” to South Asia: A Selected Bibliography on India, Pakistan, Ceylon (1957a, 1) are: “If one intended to read only one book on India, that book should be Nehru’s Discovery of India, an inside view of Indian history and civilization by its most prominent spokesman.” The “Introduction” is in fact a three-page bibliographical essay in which Wilson briefly states what various titles are good for or why they might interest the reader. However, the basis for these recommendations is a forty-one-page, single-spaced, largely unannotated list of publications assigned to form classes (e.g., “Periodicals”) or broad subject headings (e.g., “History—Kashmir”) in the manner of a library catalog. The “Introduction” thus seems an attempt to superimpose on the aridly impersonal bibliography the face of a well-read advisor—someone concerned with the uses of publications as well as formulaic descriptions of them.

This evokes Wilson’s distinction in 2KoP between consultants and aids. Regarding a subject literature, he says (116), the consultant or advisor is “able to say where to start, and whether starting was worthwhile, whether one might expect to find much or little of value and where one might expect to find it. He would be able to understand our purposes, and make reasonable suggestions, if not specific recommendations, about the best ways of attaining them; but he might also suggest that our purpose was unattainable, or that no textual means would be likely to be of much value.” The consultant, in other words, has read or read about a fair number of items in the literature—knows them from the inside, so to speak—and can assess and prioritize them on an inquirer’s behalf, possibly including the one best thing to read. Suppose, for example, a reader complained that the Nehru book is too long. A responsive consultant could identify its most pertinent parts for that reader or name a shorter but still reputable history of India.

The aid, a relative outsider, is much more limited in such dealings. For instance, if a student in 1960 had wanted a book on the history of Kashmir, the aid might produce the titles under that heading in Wilson’s (1957a) bibliography, but could do little more than that, having formed no opinions about them. The aid (2KoP, 117):

  can discover for us answers to bibliographical questions if the answers can be got immediately or mediately from bibliographical instruments …. He is one who can do those things which can be done on the basis of knowledge of the specifications of bibliographical instruments, a minimum of general knowledge, and the specific instructions of the person he is aiding.

“Bibliographical instrument” is Wilson’s generic term for tools such as free-standing bibliographies, printed or digital library catalogs, indexes, guides to literatures, and journals of abstracts. The “specifications” of such instruments (59-62) he defines as: 1) the domain (i.e., the set of writings) from which their contents are drawn; 2) the principles for selecting their contents; 3) what counts as a listable unit in them; 4) how these units are routinely described; and, 5) how the descriptions of the units are organized (cf. Bates 1976) (more on specifications later). The aid might know instruments in this sense better than the consultant does, but that is not enough to make the aid more helpful.

Wilson regards consultants and aids as ideal types that real people only approximate. The consultant resembles a scholar or subject expert in a given field; the aid, a librarian with bibliographical instruments and a collection in that field. As Wilson knew, there are scholar-librarians who can serve as consultants in their areas of expertise (he was one himself). However, deep subject knowledge is not generally presumed in librarians; it is not part of their professional image, so to speak. Librarians are front-line specialists in textual metadata. They are trained to acquire, create, and use sources in which writings are characterized, but which do not directly answer most non-bibliographical questions. Accordingly, when librarians seek to teach their potential customers what they know (informally or in classes), they discuss sources in which answers to questions might be sought, should the need arise. By contrast, consultants might use their own expertise to simply answer the question, obviating further inquiry. Wearing another hat, consultants might also synthesize research results for others, thereby conserving their reading time, which is another skill not ordinarily expected of librarians.

How society organizes potential consultants is discussed at length in Wilson’s 1977 book, Public Knowledge, Private Ignorance (PKPI), under the headings “Specialists in Knowledge” and “The Social Organization of Knowledge.” Not surprisingly, the availability of helpful information or advice on various matters is shaped most strongly by occupational structure: people know what their jobs require them to know. Within this structure, librarians are prepared to give help of three limited kinds (100-107):

– Bibliographical assistance. Staff in special libraries may search literatures (as in Wilson 1992) and prepare bibliographies for researchers, but service at this level is generally reserved for the fortunate few. Neither public nor academic libraries are staffed to give such time-consuming help to their numerous, relatively unsophisticated users. Rather, librarians in these settings (and most others) simply refer their users to existing bibliographical tools or to areas of the collection where self-service may be productive.
– Question answering. Librarians do try to answer some non-bibliographical questions directly. That is, for customers with questions about specific matters of fact, they will search for answers in ready-reference tools such as almanacs and atlases. Nowadays, of course, people look up their own answers on the web, and even when Wilson was writing, librarians’ ready-reference services were hardly over-used. But beyond the shallow nature of these services, Wilson notes that, as a rule, librarians were not prepared to vouch for the accuracy of their answers. They searched only until an answer had apparently been found and almost never tested it for correctness across more than one source. Independent checks thus quite often showed their answers to be wrong—something neither they nor their customers had suspected. At most, librarians consulted works they deemed authoritative and then attributed those sources in their replies. This failure to assume responsibility for the quality of their answers casts doubt on their professional status. “People talk lazily of libraries as storehouses of information,” writes Wilson (1998, 311), “but they contain at least as much misinformation as information, and the problem is to tell the one from the other.”
– Selection assistance. On many occasions, people would welcome trustworthy advice on what to read or where to find the best information. Consultants steeped in particular writings can usually perform this service for people better than bibliographical aids. Unfortunately, consultants like
this are often undiscoverable (or, if found, unavailable). Librarians, by contrast, are easy to find, and if they could complement their collections with dependable recommendations, many people would benefit. However, librarians are not and cannot be universal experts, with detailed, accurate subject knowledge across many fields. What a librarian can do, Wilson notes (PKPI, 105) “is to produce others’ recommendations—reviews, lists of recommended readings, lists of standard or canonical writings. He may be able to say, like an assistant in a book store, that one title is very popular, another has been well reviewed, and another has apparently a good scholarly reputation. Again, the librarian avoids making an independent judgment on the accuracy and trustworthiness of a text; he reports the views of others, or gives the patron a collection and lets the patron do the deciding.”

Librarians do sometimes recommend other persons as sources of information, but here, too, they typically act more as aids than as consultants. “If I ask to be referred to a personal information source,” writes Wilson (107), “I do not expect to be referred to an arbitrary source, but to the best, or at least a good, source. I do not want a list, say, of doctors or lawyers; I can find that in the telephone book. I want to be told which is a good one. Even if there is only one agency or personal source for some sort of information, I want to know whether it is any good or whether I would do better to avoid it. This sort of advice is not, so far as one can tell from published literature, offered by libraries.”

So, are consultants always more effective? No, because they cannot guarantee their advice either—cannot guarantee that it will produce successful outcomes. In his chapter on “Reliability” in 2KoP, Wilson points out that, except for relatively simple problems, there are no clear tests of success in advising readers (126): “If my adviser tells me that a certain work is worth examining, and I do look at it but find nothing in it to my purpose, the outcome is perhaps no success but neither is it a ‘failure’ (except, perhaps, on my part), and does nothing at all to discredit the advice.” A recent illustration: after confessing an “embarrassing” inability to read poetry, the author Amy Chua (2018) says, “A good friend gave me Edward Hirsch’s How to Read a Poem, which I read and still have on my shelf, but it didn’t work.” Or take Wilson’s own claim that Nehru’s book is the best introduction to Indian history. Suppose someone begins it and gives up; is this a failure on Wilson’s part? No, his advice remains justifiable. On the other hand, suppose someone enjoys and learns from the book; does satisfaction with it prove Wilson right? Not altogether; another choice might have been even better. Wilson (1978, 20-21) observes of subjective satisfaction in general: “There are obvious reasons why one should take care to see that users of information systems are satisfied, but it is not obvious that their satisfaction should be the goal of the system; rather, it is the satisfaction of their needs and wants that should be the goal.” He means satisfaction that is logically, not psychologically, related to fulfilling needs or wants, because the psychological kind may be illusory.

More broadly, who is competent to evaluate Wilson’s advice? Some experts on Indian history might agree with him, but others might favor other introductions; for any consultant, these evaluations are matters of opinion, not fact. Unless success can be measured by some objective test of utility, all we can hope for in grappling with superabundant writings are consultants’ best guesses on what to read.

3.0 Modeling information seekers

The foregoing account illustrates Wilson’s fusion of social and bibliographical themes. Stated tersely:

– Instruments such as bibliographies and catalogs undoubtedly have their uses, but for many questions, persons are preferable sources of answers, if such persons can be found.
– Characterizations of writings by their potential utility to us are preferable to neutral bibliographical descriptions of them. But then someone must evaluate writings for that purpose.
– While people frequently have questions that writings can answer, most people do not want long lists of possible things to read. They want the one best thing to read, which again involves critical evaluation.
– In all of these matters, what we would ideally like from information services, including those in libraries, is not what we can routinely get.

The notion of the one best thing to read, such as Nehru on India, is contextualized in Wilson’s discussion of library users and non-users in any large population (PKPI, 94-99). Here he posits a variable called studiousness. This is “the number of sources [i.e., full-text documents] one is prepared to use together in relation to a single decision problem.” The individuals who are unwilling or unable to study any document in relation to their problems are studious to degree zero. Individuals “prepared to study a single source, but no more” are studious in the first degree. For them, even two documents are too many (94-95):

  If we use two sources together and both tell us the same thing, the second source has added nothing except, perhaps, a degree of confirmation. If the two tell us different things, however, the work of the additional job of comparison, reconciliation, and decision of which to believe is added.

Finally, “those willing to use together any number of documents,” are “studious in the nth degree,” where n designates the tolerable number of documents.

Since each new document adds effort to reaching a decision, degrees of studiousness are distributed very unevenly across the population. Wilson imagines a falloff in which the largest group of people would be at degree zero, the next largest group would be at the first degree, and then the group frequencies would sharply decline rightward as the count of documents to be studied increased. Wilson does not put it this way, but it appears that, were actual data available, the frequencies would form a reverse-J or power-law curve of the sort common in LIS.

quality. Items that would be jointly useful if found may be hard to find because they are topically dissimilar. Also, many items in the complete library may be written in technical vocabularies or foreign languages the user does not understand or understands only with difficulty. If the language of the items is understood, the user may still lack enough background to evaluate what is claimed in them. More generally, the user must be asking the right question and willing to commit non-negligible time and effort to it. And lastly, the content of the library items must be reasonably accurate and not false. Wilson’s compact presentation of these problems amounts to a rationale for avoiding large libraries whenever possible. As such, it contributes to basic information behavior theory.

Libraries and reading can be integrated in a general model of personal information systems (36-39). Most adults have internal images of the world that are more or
less well developed, but whose particulars are continually The studiousness variable can be used to partition any updated by information from various sources. Wilson as- society’s members. Potential consultants on what to read signs these sources to three systems: can be defined as persons who have already shown them- selves to be studious to various degrees in certain litera- – The monitor system. Everyday means of monitoring tures. A library’s potential customers can be imagined as our surroundings are, first, observation and, second, zero-book, one-book, and multi-book people—the first communication with others. For example, we might being mostly unreachable and the last frequently made up routinely check certain places, talk to certain persons, of aspiring or actual knowledge workers and decision- and follow certain media reports. An inventory of our makers (While Wilson’s oeuvre deals most often with monitoring systems at a given time would list these knowledge workers, he also supervised Elfreda Chatman’s sources, the topics associated with them, and the fre- (1983) dissertation on the working poor, many of whom quency with which they are used. Our habits are shaped are in the zero-book category. She became known for her by the perceived utility and quality of the information writings on how several such groups seek and use infor- they supply. mation). In this context it is the one-book people—those – The reserve system. We also know of non-monitored studious in the first degree—who interest Wilson as he sources we can turn to if needed. We value having these considers potential users of libraries as information cen- potential sources in reserve even if that need never ters. These users rarely if ever need what he calls “the com- comes. For the great majority of people, libraries and plete library”—that is, a large set of deep collections ac- the items they hold, such as reference works or data- cessed through complex bibliographical instruments. 
In- bases, fall in this category. (So would web reference stead, they need relatively small collections of readable sites.) “single-package” works that deal with commonplace prob- – The advisory system. Wilson emphasizes the third sys- lems and that can be accessed through browsing. But this, tem because it can supply not only information but Wilson observes, is precisely what typical bookstores also counsel on what to do in problematic situations. While offer. Bookstores do not eliminate the need for public li- both persons and writings may qualify in the advisory braries, but they put continual pressure on them to justify role, persons are much more important, he says (38), their economic existence. because they “can fit advice to the circumstances of the If people have practical decisions to make (Wilson’s ex- particular case and the particular time, as documentary ample is, “Should I sell my car and use public transporta- sources cannot with any exactness. Documentary ad- tion?”), why would they not benefit from having a large vice must be more or less impersonal, directed to cir- array of well-indexed collections at their disposal? The cumstances of a given type. Whether our own circum- PKPI chapter on this question—”Access to the Complete stances fit the type is exactly what one needs to know Library”—answers largely in terms of mismatches (88-93). but cannot find out from documentary sources.” And The user may lack the right search terms to find items rel- again (40): “We can converse with people and (often) evant to the decision. Topically-organized bibliographies get quick answers. We can ask them, in effect, to reor- may not align well with it. On-topic items may be low in ganize what they know to bring it to bear on a problem 284 Knowl. Org. 46(2019)No.4 H. D. White. Patrick Wilson

and to select from their stock of knowledge the things racy, adaptability, and scholarship. Simultaneously, read- that we should know. We can ask them to use our prob- ers must consider the utility of texts in light of their lems, our interests, and our capacities as bases for the own interests, knowledge, and capacities. Their con- selection, organization, and presentation of part of cluding step is to decide how well the texts have actually their stock of knowledge, or as bases for the giving of served them (22-23). definite advice. They are supple and adaptive sources of – Descriptive control is the power to line up information, as documentary sources are not. Anything of writings that meet an evaluatively neutral descrip- a personal informant or adviser might tell us could be tion—neutral in the sense that no one has appraised part of a documentary record, but documents do not their likelihood of helping the reader, or even how well reorganize themselves and rewrite themselves on de- they actually fit their descriptions. In its ideal form, the mand to fit new questions.” wielder of this power “can have summoned up every writing that fits his arbitrary description so long as the 4.0 Bibliographical control applicability of the description can be discovered with- out any consideration of virtues or vices or utilities” For Wilson, stocks of knowledge in the traditional philo- (25). As examples of neutral descriptions, Wilson gives sophical sense of “true, warranted beliefs” reside only in (22) “authored by Hobbes,” or “discusses the doctrine people’s heads (PKPI, 4). But writings can represent peo- of eternal recurrence,” or “contains the word ‘fatuity.’” ple’s knowledge—and their non-knowledge as well, such Writings with these features we can imagine being re- as their opinions, conjectures, fantasies, and false beliefs. 
trieved through explicit term-matches in bibliographical Moreover, the same writing will frequently mix knowledge indexes or full-text databases. Perhaps the most familiar with non-knowledge, and because the two are rarely implementations of descriptive control are instruments flagged as such in texts, the differences between them are that characterize items by their genres, authors, titles, by no means necessarily apparent. “Since there is no mark dates, and subjects. As a matter of policy, the most de- by which we humans can recognize the truth when we see sirable expansion of such control for Wilson (147-148) it, we have invariably to make do with the best opinion we would be greater revelation of the subject matter in can get, the best attested opinion” (2KoP, 27). Thus, to call writings. bodies of writings in their entirety “public knowledge” is to mislead (PKPI, 4-5). Yet by common consent, innumer- Descriptive control may be identical with exploitative con- able writings are worth reading. They pass the test for pub- trol when retrieving neutrally described texts is an end in lic knowledge when that is defined not as absolute truth, itself—for example, if I want the first edition of a play, or but as (5) “the view of the world that is the best we can a book that, by the virtue of full-text indexing, contains a construct at a given time, judged by our own best proce- textual string that I supply. But, in general, such control is dures for criticism and evaluation of the published rec- weak at identifying writings by personalized function— ord.” Wilson’s ideal when both function and personalization are Given that people with one or more degrees of studi- taken seriously. Suppose I want to remedy my shyness, and ousness often read for practical purposes, they naturally I look for “self-help” books on that problem. For me at seek the writings that will advance these purposes the least, the promise of any book so labeled may be decep- most. 
As noted, librarians have traditionally tried to assist tive; as in Amy Chua’s case, it does not help me help my- them by providing instruments that characterize published self. Or suppose I want the best introduction to economics writings of all sorts. Taken jointly, these characterizations I can find, and I look for textbooks with “economics” and bring writings under what librarians call bibliographical “introduction” in the title (26). It hardly needs saying that control. In 2KoP, Wilson reimagines such control as two this is not a surefire route to what, for me, would be the kinds of power that a reader might have. most suitable introduction—the one best thing to read. Exploitative control has its parallel in the services of – Exploitative control is the power to obtain the best tex- consultants; descriptive control, in the services of aids. tual means to an end. In its ideal form, the wielder of it The better power to have, obviously, is exploitative con- (25) “has merely to say what he wants the writings for, trol. Wilson comments (26): “The only reason for wanting and is then provided with whatever will suit that pur- the ability to line up a population in arbitrary ways is that pose best, whatever it is.” In practice, exploitative con- one lacks the other power, and has oneself to attempt dis- trol depends on evaluating texts for their potential to covery of the best textual means to one’s ends by scrutiny help specific readers. Consultants may attempt to do of members of various neutrally described classes of the this; more often, readers will attempt it themselves, by population.” Exploitative control is not imaginary; readers considering texts for virtues such as intelligibility, accu- frequently do find the best textual means to ends. For in- Knowl. Org. 46(2019)No.4 285 H. D. White. 
Patrick Wilson stance, they find guides that not only lay out the steps for Recall that, in Wilsonian specifications, the set of doc- doing something, but that lead to success in doing it—e.g., uments considered for inclusion in an instrument is called kitchen recipes, statistical algorithms, parts catalogs, avion- a domain. Bibliographers create instruments by selecting ics manuals. However, in incalculably many cases, exploi- documents from a domain on grounds such as their lan- tative control exists only as an ideal. What is more, the guage, form class, subject matter, time of publication, and “best textual means,” even if found, may be unrecogniza- audience level. Then the bibliographers’ specifications of ble as such (30). The situation resembles that of readers domain and selection principles, if trustworthy, imply that with respect to consultants: success in achieving goals is they have included all the documents in the domain that not guaranteed, if only because readers vary so much in met their selection criteria, and that further searches over the qualities they bring to text-based endeavors. Wilson that domain are not needed. Explicit specifications thus nevertheless equates consultants with exploitative control increase the powers that bibliographical instruments pro- (149), since it is obvious “that the use of bibliographical vide by licensing certain inferences. For example, Wilson apparatus is not an activity engaged in for its own sake, distinguishes on this basis between an inconclusive litera- that it is an activity that people will avoid so far as they can, ture search and a search that is a negative success (2KoP, and that it is in general more pleasant, more efficient, and 58-59). 
A negative success occurs when we can infer that quicker to ask a question of a person likely to know the documents meeting our criteria do not exist, because bib- answer than laboriously to seek the answer in catalogs and liographers have established that fact through prior bibliographies.” searches based on their specifications. A search is incon- Both kinds of power can be assessed on certain dimen- clusive when we cannot tell whether documents remain to sions (34-39): the populations they would serve, how reli- be discovered, because bibliographers have not stated their ably they can be exercised, the extent of writings they procedures, leaving us up in the air. cover, their versatility in meeting demands of different Two further examples: specification of the units listed sorts, and the nature of items supplied under them, from in the instrument (e.g., “books and articles only”) allows us vague sets of unlocated titles at one end, to copies of full to infer that other potentially valuable items must be found texts for personal ownership at the other. Wilson con- elsewhere. Specification of the routine descriptions of cludes his argument by imagining, as a rhetorical device, items (e.g., by author, date, and so on) allows us to infer omnipotence on these dimensions (39-40): that the absence of a descriptive feature (e.g., date) means that that feature is absent in the document and not simply If I had the greatest conceivable degree of exploita- omitted by accident. Wilson’s remarks on specifying how tive control, I would be able to have the best means bibliographical instruments are organized by subject will to my and everyone else’s ends supplied instantane- be taken up in Sections 6 and 7. ously, effortlessly, with absolute reliability, the supply Bibliographers and librarians essentially do the same consisting of the most suitable copy or performance thing, says Wilson (1998, 309), in that both groups search in the bibliographical universe. 
If I had the greatest and analyze files of writings, select items for inclusion in conceivable degree of descriptive control, I could new contexts, describe the items, and organize the descrip- have supplied, under analogous conditions, items tions. Their only real difference is that bibliographers make satisfying or fitting any neutral or non-evaluative de- virtual collections of documents, whereas librarians make scription whatever. actual ones. But when Wilson first taught at Berkeley, ideas like these were not routinely part of library school courses. A fantasy, of course, but it jolts us into thinking about ac- In his words: “my aim was to end the isolation of catalog- tual bibliographical instruments in terms of the powers ing and classification instruction from questions of policy they give. Take, for instance, a large library’s online catalog. and alternative practices, to try to prevent students (and How does it perform on the dimensions of exploitative teachers) from thinking of the subject matter as just tech- control? Of descriptive control? Are its objectives even nical routine to be mastered and get them to think of it as stated? What would be feasible advances in its capabilities? a central part of a very large, complex system of biblio- How is it linked to other bibliographical tools? By what graphical organization.” As “a bibliographer among cata- criteria can its successes and failures be judged? Pursuing logers,” he was familiar with the latter’s tendency to focus questions of this sort, one sees the relevance of knowing on detailed lore about current practices, a sort of myopia the general rules by which the catalog was constructed. he opposed. His own mind was stocked with examples of One wants its specifications, which is why Wilson argues librarians’ follies; for instance, as a Berkeley librarian he that makers of bibliographical instruments should state had been assigned to make entries for a labor-intensive cat- them. 
He himself did this to some extent (Wilson 1956, v- alog of maps that no one ever used (305), yet he also knew vi; 1966b, vii-x), but the practice is far from universal. of valuable books that were not findable because they ap- 286 Knowl. Org. 46(2019)No.4 H. D. White. Patrick Wilson peared in monographic series, and library policy at the time quirements for Bibliographic Records, or FRBR (Coyle 2016), was to catalog the series by name rather than the books which did not appear until the 1990s. (2KoP, 61). Thus, his implicit question to the makers of “The Catalog as Access Mechanism” (Wilson 1983a) any instrument is always: “Why are you doing this in this subverts received ideas by taking them literally. In Charles way? What purpose does it serve?” By stressing the design A. Cutter’s nineteenth-century dictum, the catalog’s first and critical evaluation of bibliographical instruments, and objective is “to enable a person to find a book of which not solely their maintenance, he was performing the phi- either the author, the title, or the subject is known.” Taking losopher’s job of teaching his readers how to think. “find” literally, the card catalog did not suggest the where- abouts of any book not in its assigned place on the shelf. 5.0 Reimagining cataloging Nor did it lead to books not owned by the library but avail- able through interlibrary loan or some other means. Nor Wilson’s proposals for library catalogs were visionary in did it lead to texts of the same work held by the library their time, and some still are. These instruments put ex- (e.g., Macbeth) if they did not occupy a whole book or were plicit descriptions of published items in specific arrange- not in foreseeable volumes (e.g., Shakespeare’s Tragedies). For ments or give the descriptions specific points of access. 
example, while analytics on a catalog card might reveal that Naturally this triggers policy questions like “What items Plays of the Supernatural (an imaginary anthology) contains should a catalog cover?” and “How should the items be Macbeth, analytics are not access points, and someone who described?” Well into Wilson’s career, the answers presup- did not already know that anthology by title or editor posed print technology, publication in book format, and would not find that text of Macbeth. Why, Wilson asks, card catalogs. Strictly speaking, books merely package con- should one text of the work be cataloged but not another? tent; they are not identical with what is packaged. Yet Cutter’s second objective for the catalog is “to show what books were the unit cataloged (and remain so), partly be- the library has by a given author, or on a given subject, or cause of their visibility and tangibility in collections. In in a given kind of literature.” Taking literally “show what breaking with this past, Wilson prepared his readers for the library has,” the card catalog did not show authors’ new possibilities of computerization. works published, e.g., in serials or as book chapters. It also The first chapter of 2KoP is called “The Bibliographical notoriously failed to show everything the library has on Universe.” The items of this universe are not books but given subjects. For example, under the principle of specific intangible writings (or recorded sayings), decoupled from entry (Wilson 1979a), books were assigned the most spe- any particular storage medium (6-11). A given writing may cific subject heading that covered the entire book, and so be regarded as a work (a linguistic composition of any a book on, say, political polling would receive a heading length judged more or less complete by its producer), as a indicating that topic. 
However, someone who assumed text (an abstract string of linguistic symbols in a certain that everything the library had on political polling ap- sequence), as an exemplar (the union of a text with a du- peared under that heading would not find a book on, say, rable storage medium), and as a copy (a reproduction of American political history that had rich material on it (The an exemplar). The set of copies made from a single exem- catalog’s failures in revealing Cutter’s “kinds of literatures” plar is an edition. Applying this chain to typical books is are taken up below). straightforward; for practical purposes, there is only one “The Second Objective” (Wilson 1989a) continued to edition—one work, one text, one exemplar, and one set of examine books vs. works in light of the increasing capabil- copies. But for other publications, the chain is much more ities of computers and telecommunications. Those capa- complex: many valuable works consist of families of texts bilities were creating virtual libraries of e-texts that could that vary among themselves; the texts appear in different be stored, copied, and read anywhere; one no longer had exemplars; editions made from diverse exemplars prolifer- to visit an actual library to obtain copies. This called into ate across locations; significant intertextual ties exist question Seymour Lubetzky’s 1953 formulation of a cata- among different works, and so on. The cataloger attempts log’s two objectives, as phrased by Wilson (7): “the first, to to bring order to this complexity so that works are discov- ‘enable the user … to determine readily whether or not the erable and copies of them are findable. But note that li- library has the book he wants’; the second, to reveal what brary users typically want a copy of a work; its medium of works the library has by a given author and what editions storage (e.g., in a journal or a book) is secondary. 
Thus, the or translations of a given work.” The first objective lessens work-text-exemplar-copy chain opens up works of any in force as library ownership of physical copies lessens in length to cataloging; they need not be published as books importance. However, given the priority of works for us- to qualify. Note, too, that Wilson’s 1968 work-text-exem- ers, Wilson’s main assertion is that the second objective plar-copy chain anticipates the work-expression-manifes- should actually be the first. That is, publication in book tation-item chain of the computer-oriented Functional Re- format should no longer be a screening device for deter- Knowl. Org. 46(2019)No.4 287 H. D. White. Patrick Wilson mining what writings are cataloged; the same instrument divisions to topical headings for works. What is implied, can record and unite an author’s books, articles, book W&R ask, if a work has no form subdivision added to its chapters, and papers in an “index-catalog.” (Today’s re- record? They conclude that there are no “generic” works search libraries, e.g., Berkeley’s, are moving in this direc- that cannot be cataloged by form. Rather, there are simply tion.) Thus, the work would replace the book as the unit works for which proper form headings are as yet uncreated cataloged. The first part of the catalog record would pre- or unavailable for free assignment. Through a process of sent the work’s author and title in standardized form, de- elimination, they determine some of these to be compo- scribe its content, and give (9) “historical or contextual in- site works, such as non-literary anthologies. Others are formation relating to its creation. The second part would “single complete factual discursive” works, such as how- be open ended, a potentially growing locating record tell- to-do-it guides and book-length introductions-to-some- ing us that the work appears in such and such a book, also thing. 
These and many other potential form headings are in such and such a journal, and in such and such a micro- already in use by publishers, authors, reviewers, and read- fiche collection, and so on.” ers. They are also evaluatively neutral. The question then is Importantly, virtual copies of a work would be added why the Library of Congress leaves unnecessary gaps in its to its locating record. Historical or contextual information repertory of form headings—a question still with us today about a work could define it in terms of sequences of tex- (For some critical responses to Wilson’s ideas on catalog- tual states. A finished, stable work emerges from drafts and ing, see Yee 1995 and Svenonius 2000). may also be released in new editions, all of which join its sequence of states. In a growing work (11-12), “parts al- 6.0 Subject indication ready completed are stable and new parts are continually added.” In a changing work (e.g., a database), “parts already Large-scale provision of subject access to writings in- produced are changed, new parts, are added, and old parts volves characterizing them with terms that supposedly ex- are subtracted”; every update thus represents a new copy press their degrees of topical similarity and that also map of the work. This opens new challenges in characterizing onto people’s interests. Put differently, the terms label unstable works bibliographically, a problem less salient in places in pre-arranged schemes such as subject-heading the days before computers. A last consideration (14-15) in- lists, thesauri, and classification schedules, and writings are volves what “smaller” genres might be cataloged, such as assigned to those places in bibliographical instruments. short stories, poems, book reviews, newspaper articles, and Describing a hypothetical subject scheme (2KoP, 66), letters to the editor. 
While adding these to the main index- Wilson makes the point, important for KO, that such catalog would make it unmanageably large, there is no rea- schemes indeed list subjects, not concepts. In so doing, he son in principle why they could not be cataloged as above distinguishes between understanding the meaning of and linked to the main file in files of their own. terms, and using those terms to refer to writings. Subjects A more specialized essay on the second objective (Wilson are indicated by the act of referring. For example, suppose 1989b) responds to Ákos Domanovszky’s proposals for cat- an imaginary book called Flames tells the story of altar can- aloging editions of a work (as in the Lubetzky quote). dles. Then by assigning the book to “Altar candles” in a Briefly, Wilson counter-proposes a policy (347) that would: scheme, one is in effect referring to its subject matter—to a) represent distinct works separately; b) label as identical the things that Flames itself refers to at length. The term “Altar different editions of a work that have fully or nearly identical candles” also has one or more meanings for the scheme’s texts; and, c) bring together works that are strongly related users (perhaps aided by a scope note), and if one chooses, by criteria other than textual identity. Item (c) means that these meanings can be related to the concept of altar can- works that derive from a core text, such as translations or dles. However, Flames is not about how one understands adaptations of a classic, should be linked to it. This has long the term or demonstrates that understanding, as if a con- been accomplished by giving classics uniform titles and then cept were being analyzed; it is about altar candles in the cataloging derivative works under these titles. Item (b) is world. 
“One can write about concepts,” Wilson says, “but much more novel; to this day, catalogers do not label identi- most writings are not about concepts, but about other cal texts across editions. Yet many potential readers would sorts of things, for instance, water, queens, candles.” Fol- like to know that one text of a work is substitutable for an- lowing his logic, it appears that, even in writings about con- other (e.g., the proceedings version of a paper for the jour- cepts, authors are referring to concepts as their subject nal version), especially if the two are not equally accessible. matter, and bibliographical terms that echo that fact would Wilson and Robinson (1990) found a state of incom- simply be subject indicators, not “concept indicators.” pleteness in the Library of Congress headings that identify This would also hold if “concept” is merely being used to Cutter’s “kinds of literatures” by form (e.g., directories) or mean a complex abstract idea, such as “similarity” or “au- genre (e.g., fiction). Catalogers typically add these as sub- tism” or “democracy.” 288 Knowl. Org. 46(2019)No.4 H. D. White. Patrick Wilson

The example of Flames has the advantage of seeming very straightforward, but, in Wilson’s view (69), bibliographical instruments that indicate subjects are “the most difficult to make and the least generally satisfactory.” He explains why in several of his works but especially in 2KoP.

6.1 Subjects of entire writings

The 2KoP chapter “Subjects and the Sense of Position” analyzes the situation of those who assign entire writings to places in subject schemes. These bibliographers (to call them that) have already placed millions of works and continually add more; characterization by subject would, therefore, seem to pose few difficulties. But how, Wilson counters, do bibliographers decide the subject or subjects of writings so as to assign them to one or more positions? He calls the matter “deeply obscure” and notes that no manual offers rules on how to do it (70-71). Nor are such rules ever likely to be found, because the very notion of “subject” (or “topic” or “aboutness”) in writings, while not meaningless, is inherently vague.

The common intuition that bibliographers can identify “the” subject of a writing requires them to choose a labeled position that precisely describes the work as a whole. Far from being easy in all cases, Wilson writes (89), “The notion of the subject of a writing is indeterminate, in the following respect: there may be cases in which it is impossible in principle to decide which of two different and equally precise descriptions is a description of the subject of a writing or if the writing has two subjects rather than one.”

Suppose bibliographers could obtain lists of terms that identified: a) everything the writing explicitly mentions; and b) all its implicit concepts (i.e., abstractions inferred from its text without being mentioned in it). Wilson calls such a list the writing’s “cast of characters” (77-78). But even the “cast of characters” for a writing would not necessarily lead bibliographers to its unique subject; in fact, the “cast” would likely contain multiple equally precise descriptions of it, thereby complicating placements. The far more limited information that bibliographers actually work with is still equivocal as to “the” subject (or subjects) of a writing. The heart of the difficulty is that bibliographers’ guidelines do not link the labeled positions in subject schemes to any consistent set of documentary features. If they did, specifications to that effect could appear in bibliographical instruments, but of course they do not. In contrast to, say, biological classifications of plants and animals, which are feature-based, the signs of aboutness in writings are left to bibliographers’ own judgments, and no feature or set of features in a writing determines what they might infer or wish to express about a work. At most, they operate by in-house conventions and precedents rather than by rules that everyone understands.

Here is Wilson’s main argument verbatim but recast as bulleted points (90-91):

– If position is assigned on the basis of identification of some determinate feature of writings, we can know that items at a position will share features in common, and in some respect differ from items located elsewhere.
– But what can we predict about what items at a position will have in common, that will distinguish them from items everywhere else, if position is assigned on the basis of identification of subject?
– Of the items at other positions, some might have been assigned to this position if a different method had been employed of identifying subjects; in fact
– items at other positions may resemble some of the items at this position more closely than the items at this position resemble each other, and
– this not because of mistake on the part of the locator, but because of the indeterminacy of the notion of the subject of a writing.
– No single feature, and no cluster of features, set off writings at one position from those at all other positions;
– the rules of assignment prescribe nothing definite, and no confident predictions can be made about what will be found in the writings at a given place.
– So the place has no definite sense.

Wilson’s critique is most applicable to subject classification and cataloging of books in libraries; the thesaurus-based indexing of journal literatures in the sciences, e.g., medicine, is probably more predictable. Nevertheless, his broad account of subject indeterminacy explains the tendency of bibliographers to assign writings inconsistently (cf. Wilson 1992, 168). This is a problem hidden by the easy match in the “Flames–Altar candles” example.

There is no one best way to ascertain subjects. In an analysis that has influenced other writers (e.g., Hjørland 2001; Andersen 2004; Joudrey and Taylor 2018), Wilson describes four methods by which bibliographers might infer where writings should be placed (78-89). All have flaws, and all might yield different assignments for the same work. The four will be briefly paraphrased as directions, with Wilson’s caveats, introduced by “However,” immediately following.

– Authorial purpose. Look for authors’ own statements of their primary purpose in writing, the “master plan” that governs the work as a whole.

However, works often have more than one purpose, and the main one cannot always be readily identified, especially if the purposes are interlinked. Other works may have purposes that are indefinite or shifting or mischaracterized by the author (e.g., Goodman 2019 notes a misleading subtitle), which clouds placement decisions.
– Figure-ground perception. Look for the text’s dominant entities: those foregrounded in the exposition, as opposed to others treated as background. However (83), “Dominance is not simple omnipresence; what we recognize as dominant is what captures or dominates our attention, but we cannot expect that everyone’s attention will be dominated by the same things.”
– Reference-counts. Look for the items in the text that names, words, and associated pronouns most frequently refer to (i.e., estimate the counts). However, this does not guarantee that figure and ground will be clearly distinguished (83): “The constantly-referred-to item might be merely a background item, as a history of happenings in Petrograd might mention Petrograd constantly while the action was described in terms of a succession of different persons and their various doings.” Items frequently mentioned in a work (e.g., a person’s relatives) might also be grouped by bibliographers in equally plausible but arbitrary ways. At the same time, an apt subject term for a work might never occur in its text at all (e.g., the phrase “political career” in a work wholly concerned with incidents in someone’s political career).
– Unifying rule. Look for a rule that seems to unite the elements of a work into a coherent whole (for example, its principle of inclusion and exclusion or the scope of the questions it answers). Such characterizations may not be made by authors themselves, but they can be inferred. However, this again requires bibliographers to impose their own insights onto authors’ texts, and decisions as to subject placements may again be arbitrary: “a piece of artistry on our part,” Wilson says (88), “rather than on the part of the writer.”

The four methods all presume too much reading and cogitation to be feasible; they reflect principled judgments in theory, not what is or ought to be done in practice. Wilson knew full well that real-world bibliographers (91) “do not have time to brood over alternative possibilities, nor do they need, in most cases, to attempt a very precise description of subjects. It is their job to locate items quickly, and the organizational schemes they use are mostly too coarse to allow or require the making of fine distinctions. They find a location which satisfies them, and count this a success.” This imperfect solution still reigns, whether bibliographers are classifying books (under the maxim “mark it and park it”) or cataloging them under one or more subject headings.

6.2 Subjects of parts of writings

In the chapter “Indexing, Coupling, Hunting,” Wilson shifts to ways of making parts of writings (passages of various lengths) retrievable by subject, on the ground that these may be at least as valuable as writings in their entirety (93). He starts with two possible strategies. The first is to divide a writing into paragraphs (or other small stretches) and assign each to a single, finely discriminating subject position. But paragraphs, like entire works, are nebulous to assign, and authors’ inconsistent styles of paragraphing do not help. The second strategy is to greatly increase the subject terms applied to the writing overall: to assign it, that is (94), “to as many positions as we like or can afford.”

The latter set of positions would, at the extreme, be the “cast of characters” for the writing: all its implicit concepts and explicit mentions. Could concepts from the “casts” of writings be merged to create a true “concept bibliography”? Wilson rejects the idea as delusory (95). He also rejects the idea that every explicit mention of something might be valuable, giving as cautionary examples “a hundred thousand mentions of Dante” (95) and “all discoverable discussions of the freedom of the will” (137-143). When he wrote, concordances existed, but keyword indexing of full texts by computer was hardly dreamed of. Now, explicit mentions in the “casts” of digitized writings yield enormous retrievals. Entering “Dante” in Google currently produces 224 million documents. Entering “freedom of the will” produces 13.2 million. In a sense, the quality of such retrievals depends on how well people use keywords to index their own purposes. Even so, they tend to ignore all but a tiny fraction of Google’s indiscriminate search results, and they may also dismiss the texts (e.g., Wikipedia) that the search algorithm ranks highest.

There is thus still a place for human indexers guided by time-honored criteria for indicating subjects. “Internal criteria,” Wilson writes (98), “are those whose application requires looking at nothing but the writing being judged; external criteria are those whose application requires looking beyond the writing itself.”

His internal criterion for indexers is the apparent importance of discussions within texts. One test of this is perceived indispensability; that is, would removing a discussion greatly affect the text’s overall meaning? If so, the discussion should be indexed. A perhaps quicker test is simply to observe the page-space devoted to something: the greater the space, the greater its importance. But while the latter test seems sensible, it is not always easy to decide where a discussion begins and ends. First, a subject must be identified, with all the difficulties that poses. Once that is done, direct references to the subject may be visible, but what about the indirect ones? What about passages strongly associated with it by implication? Nor is discussion length a reliable indicator in all cases; something very brief, e.g., a sentence or a few numbers in a table, could be the most retrieval-worthy item in the text. Judgments of textual importance on internal grounds, Wilson says, are essentially aesthetic in nature; they resemble the opinions of editors refining a manuscript for publication.

Judgments on external grounds are not aesthetic but social; indexers should bring out whatever in texts has potential value to readers. Depending on the intended audience, an indexer can highlight quite different aspects of the same text (98): “An indexer who knows the active interests of some group of people will count as important enough to mention whatever he thinks would be seized on by one with those interests.” A reader thus may care nothing about the length or dispensability of a passage as long as it is personally engaging. In the social case, a practical distinction is between indexing for a broad group (e.g., a whole discipline) and indexing for a few specialists or even one individual. The indexer’s understanding of readers’ differing goals and interests would then shape the criteria of importance. For instance, a new text might have one set of implications for the discipline and another for the few specialists, and the two groups would want indexers to respond accordingly.

Ultimately, however, the notion of “importance” is like the notion of “subject” in that it is not linked to any determinate set of features. It, therefore, cannot be captured in bibliographical specifications or in instructions to indexers; what they do remains an art (100): “We can give long lists of examples of things to look for, but at the end of our list we must say ‘and so forth,’ trusting to the wit of the indexer to extend the list, or to see how it could be extended.” In this case, the slipperiness of “importance” contributes to the inconsistent results that indexers produce. It also implies that the relatively impersonal indexing for a group (e.g., members of a discipline) might be wholly or largely useless for a particular member of that group.

Wilson then analyzes the situation of any individual faced with impersonal subject indexing, that is, anyone searching for writings on a subject in large bibliographical instruments. Such writings are defined by the searcher’s interests. If he or she can find these writings simply by consulting a known subject position, or simply by reading further descriptions of writings at that position (e.g., abstracts, excerpts), all is well. However, searchers ignorant of terms and placements must rely on their knowledge of the world and their inferential powers to make headway. In Wilson’s terms of art, one of their tasks is “hunting,” i.e., trying to predict likely subject positions under which to look. (To convey the difficulties involved, he presents the case of someone searching Dewey classification positions for items on the history of the stirrup.) Another task is “picking,” i.e., trying to decide whether writings are retrieval-worthy when descriptions of their contents are inadequate.

Because bibliographers know that inferring subject positions is problem-ridden, they usually provide auxiliary tools to facilitate hunting (105-109). They supplement an instrument’s main arrangement of positions with alphabetical or classified indexes. They also explicitly refer searchers from one position to another to remind them where similar writings may be found, e.g., X, see also Y. Wilson calls these latter linkages “couplings,” and he distinguishes three sorts. Analytic couplings show semantic or logical ties between terms (e.g., synonyms, wholes and parts, genera and species). Factual or synthetic couplings link commonplace matters of fact (e.g., Pierpont Morgan and bankers; diamonds and cutting tools). However, the relations revealed by links of these first two sorts are seldom news; one knows many analytic couplings simply by knowing a language, and many factual couplings simply by having a standard mental encyclopedia. Of greater value are what Wilson calls overlap couplings, since they can reveal similar writings occurring in unfamiliar or unexpected positions. His example is the overlap between histories of Sanskrit literature and histories of Indian medicine.

As a “General Rule of Hunting,” Wilson proposes (110) that “Discussions of a thing X are more likely to be found in the context of discussions of a thing Y, the more closely related Y is to X.” But even to guess at X-Y relationships, searchers need background knowledge, and in this they vary greatly. Can bibliographers, therefore, couple the most closely related subject positions on their behalf? If so, how can closeness be estimated over vast numbers of subjects? Especially, how can valuable overlap couplings be made? Wilson in 1968 nibbled at the edges of certain statistical solutions then available (e.g., Kessler’s bibliographic coupling), but he did not foresee all the powers that computerizing bibliographical texts and then full texts would bring. That is, although he knew about word occurrence counts, he did not foresee the benefits of having co-occurrence counts instantly available in very large databases: counts of co-occurring descriptors, co-citations, and the like. Co-occurrences can show perceived overlaps, and the higher the count, the closer the coupling. For example, books or articles described as histories of Sanskrit literature might be frequently co-cited with books or articles described as histories of Indian medicine, and bibliographers would not need to detect this overlap themselves; it would be automatically created by scholarly citers. The availability of large-scale co-occurrence data does not solve all problems, of course, as Wilson would be the first to note. But he would have to ponder decades of statistical solutions, including automatic term-weighting schemes now standard, if he were writing a new essay on bibliographical control.

He did, however, deliver one verdict in his final book review (Wilson 2001b). His main criticism of The Intellectual Foundation of Information Organization by Elaine Svenonius is that, in a world of “self-describing” digital documents, it accepts 150 years of subject organization in libraries as secure. The continuing scarcity of instructions on how bibliographers should use the traditional schemes (204) “ought to raise eyebrows: those secure foundations had little useful to say about the application of subject descriptions? Time, then, to start afresh.”

7.0 The catalog vs. the encyclopedia

Real-world bibliographers apply subject terms inconsistently in part because they lack explicit rules of procedure. Wilson takes up this matter in 2KoP by performing a thought experiment with a pair of imaginary instruments that do have explicit rules. Using distinctive capitalization, he calls them The Catalog, which affords descriptive control of writings by subject, and The Bibliographical Encyclopedia, which affords exploitative control of writings by utility (65-70). While the two might list the same writings, they would support different kinds of lookups, because they are indexed by different rules.

Wilson first asks us to imagine an indexing scheme with many different labeled places (perhaps with interpretive comments added). The place-labels (i.e., indexing terms) can be names or descriptions of anything we like. In both The Catalog and The Bibliographical Encyclopedia (henceforth simply “The Encyclopedia”), writings are indexed by assigning them to places from the scheme. However, to construct The Catalog (66):

  Assign an item to a place N, just in case the description that identifies N is a closer description of the subject of the item than is any other description in the list.

Whereas to construct The Encyclopedia (66):

  Assign an item to a place N, just in case the primary utility of the item lies in the help it would give to one engaged in the serious study of the thing mentioned by the descriptive label that identifies N.

Under both rules, the places in the scheme are labeled the same, and many writings might be assigned to the same place in both The Catalog and The Encyclopedia. Given their different criteria, however, it is at least conceivable that no writing would occupy the same place in both; that is, every writing would be primarily useful for studying some subject other than its own. More likely, Wilson explains (67), this would occasionally happen because “the utility of a writing, if any, is by no means bound to lie in its contribution to the understanding of its subject. If I am seriously interested in the study of, say, concept formation among young children, I may get no help from the writings whose subject that is, but much help from writings whose subject is chimpanzees.”

To use either system properly, users need to grasp its rules of assignment. Ideally, these would be explained in a specification as to how the bibliography is organized. Uninstructed persons would presumably find The Encyclopedia harder to interpret and use than The Catalog, since the titles of writings (indicating subjects) would more frequently clash with the place-labels (indicating utilities). But The Catalog could also pose serious problems to users, such as guessing the right level of generality for terms in subject searches. Wilson, therefore, warns that (67-68):

  Unless we understand the rules of assignment, including the rules that interpret the descriptive labels if there are any, we cannot know what it means about an item that it is assigned a particular place, we cannot know what inferences we can draw about it and about the items which are not at its place. So we do not know what we are finding, and what we cannot expect to find, when we see an item at a place.

When a writing could plausibly go in two or three places, Wilson imagines that the subject indexers of his thought experiment might not always follow the subject rule. Rather, they might arbitrarily switch to the utility rule and put it where it will “do the most good” (67). This is the indeterminacy factor in action.

Indeterminacy can be demonstrated in real-world practice. To adapt an example from White (1992, 103), the Library of Congress Subject Headings is a large, complex scheme with a place labeled “Social Surveys.” Subject catalogers have assigned to it: 1) works that discuss techniques for doing surveys; 2) works that assemble re-usable questionnaires and scales; and, 3) works that report results of surveys. Jumbling these three distinct genres under one label shows the label’s indeterminate meaning for both catalogers and catalog users (cf. the example of items under the label “Economics” in 2KoP, 64).
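The two rules of assignment quoted in this section can be read as two ways of choosing a best place from one and the same list of labels. The following is a minimal sketch under loud assumptions: the function names, the place labels, and above all the numeric scores are hypothetical and appear nowhere in Wilson. His argument is precisely that no determinate feature of a writing supplies such scores, so the numbers stand in for an indexer's unexplained judgment.

```python
from typing import Dict

# Hypothetical judgments about one writing whose subject is chimpanzees.
# The scores are invented for illustration only; nothing in Wilson's
# account says how an indexer could arrive at them by rule.
subject_closeness: Dict[str, float] = {
    "Chimpanzees": 0.9,
    "Concept formation in children": 0.2,
}
study_utility: Dict[str, float] = {
    "Chimpanzees": 0.4,
    "Concept formation in children": 0.8,
}


def catalog_place(closeness: Dict[str, float]) -> str:
    """Catalog rule: pick the place whose label most closely describes the subject."""
    return max(closeness, key=closeness.get)


def encyclopedia_place(utility: Dict[str, float]) -> str:
    """Encyclopedia rule: pick the place whose serious study the item would help most."""
    return max(utility, key=utility.get)


print(catalog_place(subject_closeness))
print(encyclopedia_place(study_utility))
```

With these invented scores the same writing lands in different places under the two rules, as in Wilson's chimpanzee example. Note that `max` silently returns the first of several tied labels, which is a small-scale version of the arbitrariness Wilson describes when two descriptions are equally precise.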


Where, then, might the three jumbled groups of works be placed if the rules for The Catalog and The Encyclopedia were strictly interpreted? The first group comprises methodological items that are on social survey research and that also assist in the study of such research as their primary utility. So, they could appropriately be assigned to “Social Surveys” in both instruments. But the second and third groups are not on social surveys as a subject; they are on whatever the questionnaires and scales measure, or whatever the surveys were about. Thus, terms reflecting their actual subjects, such as “sexual discrimination” or “attitudes toward foreigners,” would suit them best in The Catalog. By contrast, the questionnaires and scales were used in surveys, and the completed surveys exemplify that form of research. Since their primary utility or function would lie in the study of past or future surveys, assigning them to “Social Surveys” in The Encyclopedia seems appropriate.

The Catalog and The Encyclopedia are again contrasted in Wilson (1978), although not by those names. There, Wilson describes two systems with identical indexing vocabularies; in one, the documents are grouped by topic; in the other, by most significant use. He illustrates with a biography of Einstein that would be indexed under “Einstein” as a topic, but as “an example of a new method of biographical investigation” as its most significant use.

In the same paper, Wilson amended the Catalog/Encyclopedia distinction, now claiming that merely indexing a document by subject brings out its initial primary utility, that is, as a source of information on that subject. “Any further use the document has will depend on first putting it to this use, by reading and understanding it, by gathering the information it contains; this is the sense in which its use as information source is its primary use” (21). But if The Catalog does this, The Encyclopedia is worth compiling only if it brings out additional utilities. Wilson says (23) these might be “descriptions of logical relevance of documents to projects or problems.”

Real-world subject catalogers have always implicitly followed a rule that approximates the one for The Catalog. Without reading anything (there isn’t time), they simply re-express or copy key phrases (e.g., title words) from documents in authorized indexing vocabularies. By contrast, the rule for makers of The Encyclopedia is not one that ordinary subject catalogers can readily follow. To do so, they would need to read documents, then exhibit consultant-like knowledge in many topics and sometimes unusual creativity as well. Rephrasing Wilson’s examples:

  Assign a writing on chimpanzees to “concept formation in children” because that insight occurs to you.

  Assign a biography of Einstein to “new methods of biographical investigation” because you are aware it qualifies as such.

Utility indexing as in these examples requires indexers to imagine new functions of writings, and this kind of indexing cannot be routinized over vast bodies of texts in the same way as subject indexing. How, we may ask, can ordinary subject catalogers, or any indexers, be expected to predict the “most significant use” of all the writings they must process under time pressure? Moreover, whose most significant use? Could they ever know enough to judge every document in light of its “logical relevance to problems and projects”? This is rather like expecting them to connect hitherto unconnected literatures, the problem identified in Swanson (1986). Suppose I, as an indexer, read the item on chimpanzees but have no clever ideas on non-obvious uses for it, or I read the Einstein book but am unaware of its contribution to biographical method. By the rule for The Encyclopedia, I would still have to put them somewhere, and here the possibility for idiosyncratic guesses and mistakes seems great: for instance, I might assign a book on Bayesian statistics to “information retrieval” because I am ignorant of its relevance to other fields. Moreover, because my notions of utility would be unexplained, users of The Encyclopedia would have no quick way of learning why I assigned an item to a place. Worse, they could never be sure where to look for something.

Much of the time, Wilson treats subject indexing and utility indexing as if they were equally feasible. Yet he knew they are not, as Wilson (1980, 18) shows:

  It is a great challenge to librarians and bibliographers to provide what I call a “functional approach” to documents (Wilson 1978), and what Swanson calls “problem-oriented access” to literature (Swanson 1980, 112), in which documents are described not, or not merely, as being about such and such a topic but as being of likely use in an inquiry of such and such a sort. I agree with Swanson that hope for major advances in such a direction may be illusory, my reason being that functional or problem-oriented organization of literature requires guessing about future utilities, and people are not very good at doing this.

Indeed, he admitted in Wilson (1983a, 15) that, for librarians to adopt a functional approach to writings in a tool like The Encyclopedia, they would “have to start not with particular books, but with particular questions or problems, and ask about each book, What if anything might this book contribute to solving or clarifying this particular problem?” Since this approach to KO would clearly require impossible amounts of time and manpower, librarians settle for “an instrument that is fatally flawed from the serious user’s point of view.” His bleak summary: “We can’t provide evaluations, and can’t organize materials functionally, in terms of uses to which they can be put rather than topics they’re about.” Wilson’s Berkeley colleagues M. E. Maron and William S. Cooper also proposed models of indexing that required unrealistic predictions from indexers, and he himself politely undermined their work (Wilson 1968, 96; 1978, 14-15; 1979b).

Thus, while The Catalog has many instantiations in the real world, The Encyclopedia has none. Writings, it appears, could be indexed by their utilities only if the data were a by-product of some other activity and the process could be automated. As it happens, however, there has long been a form of utility indexing that meets these requirements and that also draws on the knowledge and creativity of consultant-like experts rather than aid-like indexers. That is citation indexing.

8.0 Utility indexing and citation indexing

In The Encyclopedia, a lone indexer predicts the future utility of a work, whereas in citation indexes, citers demonstrate the work’s past utility (its actual use history) in contexts from which various functions of the cited work may be inferred. The same work frequently has multiple citation contexts. Uncited works do not appear in these indexes, of course, but assuming a work is cited in the first place, citation indexes are arguably the richer form of utility indexing. In any case, they are the only systematic form we have.

As said earlier, certain writings misassigned to “Social Surveys” by The Catalog rule would be properly assigned there by The Encyclopedia rule. This claim can be linked to citation indexing if we imagine that The Encyclopedia’s labels are followed by explanatory chapters. Then the expert author of the chapter on “Social Surveys” could cite works on social surveys, or used in them, or exemplifying them, or even unrelated to them but helpful in making a point. The prose contexts of these various citations would often suggest “the logical relevance of works to projects or problems” (here, to social survey research). More generally, they would imply that authors were using, and possibly evaluating, works for specific ends, a kind of exploitative control that others, too, might adopt.

Wilson fully realized the value of authors’ references in KO. Although he devotes most of 2KoP to what he calls the formal bibliographical apparatus, such as free-standing bibliographies and catalogs, he is at pains to note that the informal apparatus of references (i.e., citations) in learned literatures is potentially far more important (58): “Insofar as the parts of the informal apparatus refer to other works and specifically evaluate or reply to or build on other writings, they add links in the complicated network of bibliographical connections, a network the tracing of which in the informal apparatus may be more valuable, if more time-consuming, than any use of the formal apparatus.” Then, in a distinction paralleling that between subject experts and indexers (or consultants and aids), he immediately adds: “if a man evaluates a work on which he has labored for days or years, his evaluation has a greater prima facie claim to be taken seriously than does that of one who had, by the magnitude of his task, to evaluate quickly and superficially an enormous number of writings.”

Given Wilson’s appreciation of references (seen again in Wilson 1983a, 6; 1983b [1992], 243), it is remarkable that he wrote so little about citation indexing. He identified his “functional indexing” with Swanson’s (1980) “problem-oriented indexing,” but Swanson’s own real-world example of the latter is Eugene Garfield’s citation indexing. Garfield had contrasted his innovation with conventional subject indexing in papers from 1954 onward, and many other authors had joined him in exploring the features of citation networks. In fact, Wilson’s career coincides exactly with the growth of the modern citation-analytic literature, yet he remained aloof from it. He cites a few bibliometricians here and there; he describes two modes of citation retrieval in Wilson (1992, 156); and he briefly discusses bibliometrics and citation analysis in characterizing LIS (Wilson 1983c, 1996a). But he excluded detailed treatments of citation indexing from his discussions of bibliographical utility. That is a gap, since indexing by citation links (and later by web links) is the sole major complement to indexing by subject indicators, and analysis of “citances” (the sentences in which citations are embedded) adds to our exploitative control of writings.

In two instances, Wilson used his own experience to show the limitations of topical indexing. These very examples make his silence on citation indexing puzzling. In the social sciences, he notes (1980, 18):

  Work that should be read may not be read for many reasons, including the reason that there was no way one could have discovered it using only bibliographical access systems based, as ours are, on topical indexing – one may be unable to guess the topic of work that would actually be of crucial importance to one’s own research. I would have been quite unable to predict the topics of all the works I have found useful in working on this essay and would not have found them through a conventional subject index.

Wilson (1983b [1992], 244) further notes that subject searches may lead to what has been explicitly said about a topic, but are no help to someone interested in what might cast light on it:

In this kind of case, the texts which you are looking for are texts that are functionally related to your question, but that need not be topically related. You want material you can use, and the things you can use may well have topics that are apparently quite unrelated to the topic of your question. For example, I recently came upon a paper on misleading metaphors in linguistics that I find enormously useful in understanding certain problems in information science. No train of see-also references could be expected to connect these topics.

The paper Wilson refers to is Reddy (1979).

It is true that conventional subject indexes would not have led him to the rich array of references with which he supported Wilson (1980) or to Reddy’s stimulating paper. Yet the functional relationships he perceived are not lost; he himself preserved them. The references in his 1980 paper now lead backward to the earlier works he cited, and the works he cited now lead forward, through citation indexes, to his own 1980 paper. These references are texts that cast light on topics without being on them. The same would hold for the paper on misleading metaphors, had he cited it explicitly (He did cite it in Wilson 1983d, 11).

The earliest major article on citation indexing, Garfield (1955 [2006], 1123), distinguished between topic and function in a way analogous to Wilson’s:

If one considers the book as the macro unit of thought and the periodical article the micro unit of thought, then the citation index in some respects deals in the submicro or molecular unit of thought. It is here that most [subject] indexes are inadequate, because the scientist is quite often concerned with a particular idea rather than with a complete concept.

Garfield implies that, whereas subject indexing is applied to whole works, citations relate to authors’ discussions in passages, which need not topically resemble the whole work in any way.

Even an army of subject indexers, says Garfield (1955 [2006], 1123), could not feasibly index passages. Strikingly, however, “By using authors’ references in compiling the citation index, we are in reality utilizing an army of indexers, for every time an author makes a reference, he is in effect indexing that work from his point of view.” This is the key insight; as noted above, the contexts of citations imply rhetorical functions that cited works perform for citers. The functions reflect citers’ perspectives and may or may not be related to the citing work’s global topic. While this is not exploitative control in Wilson’s strict sense, it contributes to that power in ways that matching a person’s subject request does not. The point is exemplified in Garfield (1955 [2006], 1125-1126). Of the twenty-three papers that had cited Hans Selye’s classic endocrinological paper “The general adaptation syndrome,” none of them appeared under “Adaptation” in Index Medicus, and none of them is clearly related to Selye’s global topic. Instead, they provide evidence for Selye’s theory from a variety of fields—an extremely valuable kind of functional information that subject indexing and see-also references would have missed. Compare Wilson (1973, 460):

It must be obvious that the concept of evidential relevance is also of central concern in information retrieval. It is clearly a desirable characteristic of an information retrieval system that it be able to provide information that could help one arrive at conclusions or reasoned opinions even in cases where conclusive arguments are unobtainable.

The reasons for Wilson’s reticence on citation indexing can only be guessed. The Science Citation Index (SCI) began in 1964, and he knew it from both reading and personal examination. He also supervised a two-volume dissertation on it by Theodora Hodges (1972). This pioneering work, massively documented and thoroughly Wilsonian in character, reached conclusions moderately favorable to citation indexing and retrieval; in essence, SCI is good but not great. Every scholar interviewed by Hodges valued the familiar network of references to earlier works from a work in hand. But that, she argues, is because such references are embedded in contexts that help to evaluate them. By contrast, SCI shows the later works that cite an earlier one, but not the contexts in which it was cited; users must do further lookups to evaluate the function and worth of each citation. SCI-style retrieval is thus noisy, and the evaluation of functions in multiple contexts is very complicated. Wilson may have thought these conclusions by Hodges made further comment on his part unnecessary.

Then there is the matter of citation counts. Hodges noted the research involving them but found it unconvincing. Wilson apparently shared this staunchly humanist opinion. In Wilson (1980, 6) he is skeptical about bibliometric counts in general, after earlier dismissing citation counts in particular as a substitute for evaluative judgments by individuals (PKPI, 7):

The scientist who publishes his results presumably wants to influence his colleagues and make a contribution to knowledge. If his work is unread, the first aim is not attained, but the second may still be. [Then in an endnote:] This is one strong reason for resisting the claim that citation counts, that is, counts of the frequency with which a piece of scientific work is referred to in subsequent publications, are an adequate measure of the value of scientific work.
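The backward and forward links described above can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions, not as an account of how SCI or any actual citation index is compiled: the function name build_citation_index and the titles "Work A," "Work B," and "Later paper X" are hypothetical placeholders, while the link from Wilson 1983d to Reddy 1979 is the one reported in the text. Inverting each work's reference list yields, for every cited work, the later works that cite it; the length of that list is the work's citation count.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert {citing work: [cited works]} into {cited work: [citing works]}."""
    index = defaultdict(list)
    for citing, cited_works in references.items():
        for cited in cited_works:
            index[cited].append(citing)
    # Sort the citing works so that lookups come back in a stable order.
    return {cited: sorted(citers) for cited, citers in index.items()}


# Backward links: each work's own reference list.
# Only the Wilson 1983d -> Reddy 1979 link comes from the text;
# "Work A", "Work B" and "Later paper X" are hypothetical placeholders.
references = {
    "Wilson 1980": ["Work A", "Work B"],
    "Wilson 1983d": ["Reddy 1979"],
    "Later paper X": ["Wilson 1980", "Reddy 1979"],
}

# Forward links: for each cited work, the works that cite it.
citation_index = build_citation_index(references)

# A citation count is simply the number of citing works.
citation_counts = {work: len(citers) for work, citers in citation_index.items()}

print(citation_index["Reddy 1979"])
print(citation_counts["Reddy 1979"])
```

The inverted index records which works cite a given work but not why they cite it. That is the limitation Hodges stressed: the contexts of citation are absent and must be looked up separately.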

A writing’s citation count roughly indicates its popularity among experts. But in Wilson’s discussion of exploitative control, popularity of this sort is another criterion he rejects—a point relevant here. He gives the example of a man who wants the best books on Cretan history (2KoP, 35):

We might take a request for the best books on Cretan history to be a request for those books that are most highly regarded by, say, “the experts” on Cretan history. If that is the request, it can be filled without any evaluation, for to report on the popularity or standing of a writing among a certain class of men is not itself to evaluate the writing at all.

He might equally have written: “those books that are most highly cited by, say, ‘the experts’ on Cretan history.” Recall that exploitative control involves evaluating a writing as a means to a personal end, and, in that sense, a book with a high citation count might indeed not qualify. However, this is to put logical consistency before pragmatic realism. Decades have passed since Wilson wrote, and bibliographical advice tailored to individuals is still ad hoc and unsystematic, if it is available at all. At the same time, formal and informal reviewing systems daily suggest various utilities of writings to various readerships. Wilson himself recommended the Nehru book to a readership and implied that Reddy’s paper on misleading metaphors might be “enormously useful” to readers outside linguistics. If we turn to citation counts as indicators of the general utility of specific writings—as proxies, that is, for the advice of consultants—we find that Google Scholar currently has a count of about 3,000 for the Nehru book and about 4,000 for the Reddy paper. Armies of citers have thus upheld Wilson’s evaluations from long ago. The citers’ form of utility indexing, moreover, comes as an automatic by-product of everything else they were doing.

9.0 An ideal information system

Wilson’s (1973) most cited paper, “Situational Relevance,” describes another ideal system—one that generalizes the notion of exploitative control of writings. What he envisions is far from the delivery of bibliographical references that match a user’s topical request. It more closely resembles the “expert systems” that would flower in the 1980s. His own expert system, an extraordinary one, gives us personalized answers to our dominant questions rather than things to read. Answers are information in the strong sense, says Wilson (1978, 10), only if they are true, whereas information in the weak sense is merely content, which can include misinformation. In Wilson (1973), the system’s answers have been critically evaluated so as to be as true (or as warranted) as possible. More precisely, they are intelligence, connoting that they have been evaluated for the confidence we can place in them and their appropriateness to our situations.

Wilson’s (1973, 468) basic model is the intelligence supplied by human advisors, with or without computer support. Some of his formulations will be glossed in due course; others will already be familiar:

Once the idea of situational relevance is set forth, and the corresponding idea of significant situationally relevant information introduced, it is immediately apparent that information systems aimed at providing the latter sort of information would be particularly desirable sorts of systems. Such systems, supplying information rather than bibliographic references, on a regular or “standing” basis, providing a personal rather than impersonal approach, yielding information selected on the basis of logical relations to our concerns rather than on the basis of subject matter, taking into account one’s state of knowledge, perhaps operating in a “tutorial” mode, modifying or reformulating information so as to be comprehensible and acceptable to us (and hence of course also capable of misleading and misinforming us, like any other tutor), would be of enormous power and utility. As noted, commercial and military intelligence systems aim to deliver this sort of information, and we rely on friends and colleagues to serve as sources of such information.

Two of Wilson’s other glosses on relevance may be given in brief. In 1968, he is against using “relevant” to describe a document that simply fits a topical description or is satisfactory to the requester. Instead, he wants to preserve the term’s traditional senses of counting for or against a claim, or helping someone to solve a problem; the latter aligns with a document’s being the best textual means to an end (2KoP, 43-53). A decade later, bowing to inveterate usage, he says in Wilson (1978, 16-19) that in information science “relevant” simply means “retrieval-worthy” and that one way in which documents may meet this vague standard is by being on a topic (Above, this was also called their initial primary utility).

His more stringent situational relevance in Wilson (1973) requires that system-supplied information must address the concerns and preferences of specific individuals. Concerns are matters in which persons are not indifferent, such as the state of their health or wealth. Persons prefer, that is, one state to another (461): “A feature or aspect of a situation will be said to be of concern to a person if the feature can exhibit any one of several different specific states or conditions, and if the person cares which specific state or condition is the current one.”
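Wilson's definition of a concern, just quoted, can be given a rough operational reading. The sketch below is an illustration only, under assumptions introduced here: the class names Feature and Person, the numeric preference ranks, and the coin-toss example are hypothetical and do not come from Wilson (1973); health is one of the concerns the text mentions. A feature counts as a concern for a person when it has more than one possible state and the person ranks at least two of those states differently. If every state is ranked equally, or not ranked at all, the person is treated as indifferent.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """A feature of a situation that can be in one of several states."""
    name: str
    states: list


@dataclass
class Person:
    """A person with preference ranks over the states of some features."""
    # feature name -> {state: rank}; a higher rank means more preferred.
    preferences: dict = field(default_factory=dict)

    def is_concerned_with(self, feature):
        """True if the person prefers at least one state of the feature to another."""
        ranks = self.preferences.get(feature.name, {})
        # States without a rank are treated as tied at rank 0 (an assumption).
        distinct_ranks = {ranks.get(state, 0) for state in feature.states}
        return len(distinct_ranks) > 1


health = Feature("health", ["well", "ill"])
coin_toss = Feature("coin toss", ["heads", "tails"])  # hypothetical example

# This person prefers being well to being ill and has no view on the coin.
person = Person(preferences={"health": {"well": 1, "ill": 0}})

print(person.is_concerned_with(health))
print(person.is_concerned_with(coin_toss))
```

In this toy case the person is concerned with health but indifferent to the coin toss, since only the former has states that are ranked differently.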

Wilson assumes that specific states can be expressed as a set of questions about which state is current. For each question there is a set of answers, called the concern set, with which the system can respond to the question, and these answers must be at least partially rankable by personal preference (i.e., they are not all tied). Jointly the answers in the concern set describe a situation for a person. Every person also has a stock of beliefs about the world that includes beliefs about concerns. Then an answer from the concern set will be directly relevant to a belief if it increases or decreases the belief’s degree of confirmation. If an answer that is not part of the concern set prompts an inference that alters the belief’s degree of confirmation, it is indirectly relevant to the belief.

Wilson’s system is based on inductive logic—the logic of confirmation and disconfirmation by evidence, which admits of degrees of probability. However, it is modeled in part on a question-answering system by his Berkeley colleague William S. Cooper (1971) that defined the relevance of answers in terms of deductive logic—the logic of strict entailment. The following adapts one of Wilson’s own examples:

Belief about a concern: My bank balance today is at least $150.
Preference: I prefer $150 and all higher amounts to all lower amounts.
Question: Is my bank balance at least $150 today?
Directly relevant answer from concern set: Your bank balance today is at least $150.
Confirmation: Answer greatly increases probability of my belief.
Indirectly relevant answer not from concern set: Your check for $175 has just bounced.
Confirmation: Answer greatly decreases probability of my belief.

For an answer—i.e., an item of information—to be added to someone’s stock of beliefs (or knowledge), the person must learn of it and accept it as true. However, the item may be situationally relevant even if the person is unaware of it, because this is, once more, a matter of logic, not psychology. For example, the bounced check is logically relevant to a concern with one’s bank balance even if one ignores the bank’s alerts. Moreover, the logical relevance of an item of information holds whether the situation is past, present, or future, although one’s concerns and orders of preference will naturally change over time.

Wilson (1973, 467) calls information significant “if it is directly relevant situationally, and if it is new information to the recipient at the time of its receipt.” Novelty thus figures in his account of relevance (as it does in many others in LIS). Significant information also revises beliefs. It must report “a condition that is either higher or lower in preference than the condition previously thought to exist (represents, that is, a change for the better or for the worse),” or it must report “no change when a change for better or worse had been expected.” Information that is only indirectly relevant may not change one’s view of the situation, but it, too, can be called significant if it changes the “confidence or probability” one assigns to items in the situation description.

Information systems are often said to aim at retrieving items relevant to interests. In Wilson (1973, 464-465) interests and concerns may overlap or switch places, but they are not the same: “situational relevance [to concerns] depends on the existence of preferences about states of affairs; interest depends on, or consists in, wanting to know about a thing, being curious about a thing.” One can be interested in something (e.g., , Zen Buddhism) without preferring that it be one way or another. Wilson (1977, 42) adds that concerns imply a commitment to act, if necessary, to attain a more preferable state. Given such commitments, a situationally relevant information system tells users what they ought to know if actions are to be taken (The bank customer ought to know about the bounced check). But absent the commitment to act, interests imply nothing that users logically ought to know (A movie fan might like to learn about collections of film noir, but that information is not imperative).

Chapter 2 of PKPI extensively analyzes personal information systems, especially as they pertain to decision-making. Describing “costly ignorance” in this context, Wilson writes (PKPI, 62):

We are sometimes sure that a piece of information would have been crucial in the sense that without it, a decision went one way, but with it, the decision would have gone another way. When the outcome of the more informed decision would have been better from our point of view than the outcome of the less informed decision, a loss has been incurred.

The loss is sometimes literally costly in money, but it generalizes to any concerning matter. On this basis, Wilson sharpens the hazy notion of “information need” in LIS: “Crucial information, lack of which would result in a worse decision, is needed information; information that is lacking but has no such effect is not needed” (PKPI, 63). He can thus define information need causally: information lacking → poorer decision. If the deliberations preceding a decision can be cast as formal premises, he can also define it logically: information lacking → better decision does not follow as a consequence. Needs thus characterized are objective not subjective. Or as he puts it in Wilson (1978, 19-20), “Questions of need are factual questions about the relation of means to ends.
It is worth insisting on this, in opposition to the common view that needs are subjective psychological states.” The latter are wants.

Wilson (1978, 22-23) relates needs and wants to a similar ideal information system, while Wilson (1986) examines them in the context of rule-governed library reference services.

10.0 The view from R&D

Despite Wilson’s preference for actionable intelligence over bibliographical lists, he was always attuned to groups for whom “the literature” is not merely an interest but a permanent concern. These are, broadly, research and development workers, such as scientists, scholars, and technologists, literature-based professionals, and students aspiring to those fields. In general, R&D workers may be said to want news of significant, situationally relevant writings—bibliographical intelligence, as it were—from their monitor or reserve or advisory systems. Wilson knew, of course, that they rely more on personal exchanges than the written archive for research-related news, but he also knew they are not indifferent to documents that would advance their projects. At the same time, they must guard against having too many things to read. He thus devoted a series of papers—almost a short book’s worth—to their use of writings (Wilson and Farid 1979; Wilson 1980; 1983b [1992]; 1993a, b; 1995; 1996b, c). These might be called “the overload papers,” and they are relentlessly deflationary.

Wilson and Farid (1979, 128-132) analyze how individual researchers avoid burdensome reading, given normative expectations. As pragmatic skeptics, Wilson and Farid mostly deny that these norms are—or need be—strictly observed in successful research. Quoting some terms from their account of norms, researchers should exhibit “situational familiarity” with the current state of knowledge affecting their projects, and “historical familiarity” with specific past studies that led to that state. They should exhibit “expert” situational and historical familiarity with writings by their “direct competitors” or by those doing “parallel work.” They should exhibit “working” or at least “nodding” familiarity with writings “ancestral” to theirs, and writings from “donor” fields that have exported theories, methods, or insights to their own. Not so, according to Wilson and Farid. They conclude that (142) “use of the literature is avoidable in theory and often in practice, except insofar as conventional requirements of scholarship prescribe its use. Its use is neither necessary nor sufficient for acquiring expert situational and historical familiarity with the immediate area of one’s work.” In particular, Wilson and Farid devalue the comprehensive literature search, which supposedly precedes or accompanies any serious scholarly project (cf. Wilson 1983b [1992], 1125-26). Especially for ancestral and donor studies, they say, such searches are misguided because they ramify endlessly. Wilson and Farid (129): “No research worker needs to be familiar with more than a small fraction of the work done by others; nor is the same degree of familiarity always needed or sought.” Researchers know they must cite works essential to their studies, but their further references to the literature are a matter of craft not obligation.

The downside of being studious in Wilson’s sense is that the more writings one tries to consider in a fixed period, the more problematical reading and integrating them becomes. Yet the problem can be managed—for instance, by assembling teams who jointly know a wider range of writings than any one member (cf. Wilson 1996b, 194-195), or by asking other researchers for summaries of current knowledge rather than bibliographical advice, or by adopting the convention that ignorance of certain literatures is permissible. Another alternative is to read overviews of the literature rather than primary research reports (Wilson 1983b [1992], 242). Librarians could help scientists in the latter regard, according to Wilson and Farid (143), by providing “more reviews, more authoritative critical surveys, more compendious works of reference, more works of haute vulgarisation, more works of synthesis” and by preparing “evaluative rather than simply enumerative bibliographies.” But in the social sciences, writes Wilson (1980, 18-19), tactics like these will work only if the primary reports are dispensable (as, say, primary documents in history are not). The sole innovative way in which librarians could help social scientists is to assemble collections of materials hitherto scattered, so that they become more convenient to use.

The norm that researchers will have current knowledge of their fields means they should keep up with the literature, whether or not it bears directly on their immediate projects. Wilson (1993a) delves into the value of currency. Conventionally, current knowledge is a desirable (and sometimes mandatory) part of any researcher’s or professional’s social capital. But here again, the norm cannot withstand scrutiny; as a general notion it is vague; if made specific, the definitions are inconsistent from field to field and trail off into indeterminacy. In Wilson’s view (636): “A requirement of keeping up with developments in one’s profession is not unambiguously a requirement to know what is going on today that is new, nor a requirement of deep understanding, nor a requirement of an exact scale [i.e., level of detail] of knowledge, nor a requirement of knowledge of every nook and cranny of the profession, nor is it a requirement to maintain the same level of currency over all parts of the field for which one is responsible.” Moreover, the cognitive impact of any one current work on any one reader is highly uncertain and could be low or even zero. Then why read beyond some comfortable minimum in the struggle to keep up?

A decade earlier, Wilson (1983b [1992]) persuasively captured the view of a specialized researcher with regard to reading. It consists of rationing attention through intense self-centeredness (241): “[W]hat others are doing is of interest primarily as it affects one’s own work, and what doesn’t affect one’s own work can be ignored.” This does not mean that researchers will lack broad knowledge of the history and sociology of their fields, but monitoring publications is not the only way of getting that; another is from mentors “by osmosis” (246). The goal is to know enough, by whatever means, to succeed in one’s own projects. If exploratory literature searches are needed, Wilson recommends treating them as “a series of brief raids” (246), conducted with high cognitive flexibility (244):

One’s notions of what one is looking for change in the process of looking. One’s ideas of possible uses change, as one learns more, through successes or failures. One thing leads to another, in unforeseeable ways. And it would often be better to speak of making things useful than of finding them useful. One makes connections, constructs bridges. Spotting potentially useful texts is very much an exercise of imagination and insight.

Wilson (1996b) deals with the problem of overload for solo researchers in the social sciences or humanities as they try to read across specialties or disciplines. They do so in the belief that complex social or cultural phenomena cannot be adequately addressed in single specialties, because multiple specialties contribute relevant information. Wilson distinguishes (193) between “upkeep overload,” caused by the endless stream of new publications in any one field, and “task overload,” caused by the volume and variety of materials that a researcher must master in projects involving two or more fields. Both result in backlogs of reading, which in turn force continual prioritizations of what will be read. If solo researchers try to enter a field new to them in mid-career, they incur steep reading costs in time and effort. Some may create idiosyncratic new specialties out of prior ones, thereby gaining greater say over which writings are relevant and which are not. But either way, the division of their attention across fields will leave large gaps in what they know. Their attempt to extend the range of relevant information, while commendable, cannot enlarge their capacity to read. The only rational way to draw on relevant information from multiple specialties is to form teams—something many solo researchers may be loath to do.

These papers share as backdrop a scientific ideal that greatly preoccupied Wilson, which is that, to be rational, researchers must consider all relevant information in doing their work (Wilson 1993b; 1995; 1996c). Individually, researchers do not—and cannot—live up to this ideal, because there is simply too much information in literatures for any one person to absorb. Teams are an improvement, but they, too, are cognitively too narrow. The proper unit for asking how well the ideal is met is the many-eyed research specialty: “It is not how some individual is affected but how the specialty as a whole is affected that is in question: it is the group as a whole that has to be persuaded that the information has an appropriate logical or evidential status” (Wilson 1993b, 379). The specialty’s cognitive situation comprises the cognitive situations of all its members, to whom inputs of information may or may not be situationally relevant.

Inputs can be communications from any source, oral or written, informal or formal. But how can they be evaluated? Wilson found a model in efficient market theory from economics. Empirical studies have shown that markets are efficient in the sense of using all available relevant information in setting prices. Therefore, Wilson correspondingly asks the degree to which members of R&D specialties use all available relevant information in doing research. Given how situational relevance is defined, if a particular input is not new or would not substantially revise beliefs—revise them, that is, objectively across the specialty—then a specialty’s cognitive system can be called “adequate” as it stands.

In Wilson (1993b) he adopts a hypothesis from efficient market theory in three forms—(a) weak, (b) semi-strong, and (c) strong—and states it both strictly and loosely. Hypothetically, the R&D communication system is efficient in that:

(a) the current cognitive situation is adequate to that specialty’s past productions (Loosely, current opinion fully reflects all prior work in the same field).
(b) the current cognitive situation is adequate to published information produced in any specialty (Loosely, current opinion fully reflects all publicly available relevant information produced in any field).
(c) the current cognitive situation in a specialty is adequate to all information, whether published or unpublished, available in any specialty (Loosely, current opinion reflects both published and unpublished information available to any worker in any field).

Wilson (1993b) and its direct continuation, Wilson (1995), develop four lines of argument against the hypothesis, but, on balance, conclude that the question of efficient communication in R&D remains open. That is, the hypothesis is not refuted, at least in its weak or semi-strong forms, because strong evidence for it also exists. In the following sketches, Wilson’s arguments against the efficiency hypothesis come first; those for it begin with “But,” indented. The first two lines of argument are from Wilson (1993b); the second two, from Wilson (1995).

– Late finds of information. Communication in a specialty is not efficient if researchers repeatedly complain of finding relevant documents too late for them to be of use. Empirical studies have found numerous failures of this sort; they also have found unnecessary duplications of research and cases of being anticipated (i.e., scooped) by other researchers. Almost certainly many more failures along these lines go undetected.

  But since cognitive differences among individual researchers are irremediable, not all late finds are equally significant; there must be a threshold. Moreover, even if a late find is very significant for a particular project, it does not greatly matter unless the failure affects multiple projects across a specialty.

– The Frame Problem. Communication in a specialty is not efficient if it is impossible, even in theory, to design systems that will bring all “must-read” relevant work to members’ attention. This is a concrete example of the abstract Frame Problem from artificial intelligence—namely, the impossibility of formulating “rules that would specify, given a representation of the world, and given a change in some feature of the representation, what other features must change or at least be reconsidered” (Wilson 1993b, 379). In general, since no communication system can reliably recommend all desirable imports or exports of information among specialties, late discovery or non-discovery of relevant information by specialty members is inevitable.

  But items that might be highly relevant to an individual or a team are not crucial imports at the specialty level. At that level, the crucial imports are the broad theories or methods that can be used by everyone. Since members in their entirety monitor various streams of research, it is likely that some of them will discover and publicize widely applicable work, even if more narrowly relevant items are missed.

– Overload. Communication in a specialty is not efficient if it identifies far more relevant items than members have time to read. A sign of overload is that researchers’ strategies for managing their reading backlogs always lead to significant omissions.

  But if no one can escape overload, then a sensible compromise must be accepted, which is to combine wide browsing with judicious prioritization. A project suffering from too many relevant documents can be redesigned so as to limit required reading.

– Deliberate exclusion. Communication in a specialty is not efficient if, for whatever reason, researchers ignore admittedly relevant materials.

  But in many specialties, research is routinely deemed successful even though countless relevant items go unread and uncited. There are various justifications (Wilson gives six) for bracketing such work—for example, to make a complex problem manageable. A broader justification is that, although the ideal of using all work relevant to a project may be rational, it is also impossibly demanding and hence wholly impractical.

Wilson (1995) further implies that the LIS systems designers’ goal of providing all relevant materials and only relevant materials to researchers is defective. In retrieval system evaluations, “relevant” documents are defined as matching the query in topic. It is then presumed that the more matching documents a system retrieves, the better its performance (as measured by recall). It is also presumed that the fewer non-matching documents it retrieves for the same query, the better its performance (as measured by precision). This paradigm, which is with us still, is not well suited to the situation of most researchers, which “is more likely to be a surfeit of relevant information rather than a scarcity” (Wilson 1995, 50). The counter-proposal in the same passage is to develop “aids in screening, evaluating, and filtering not just to distinguish relevant from irrelevant, but to separate dispensable from indispensable relevant material.” As a principle of design, Wilson’s “dispensable/indispensable” criterion from a generation ago remains radical. Today’s relevance-ranking techniques as yet scarcely address it. It encapsulates his constant theme that expertly chosen texts in small quantities are what readers need most.

11.0 Trustworthy communication

Whatever that theme is called—e.g., authoritative recommendation or individualized advice—it leads to Wilson (1983d), titled Second-Hand Knowledge: An Inquiry into Cognitive Authority (2HK). This is the most cited of Wilson’s three books, and the one classified in the Library of Congress scheme as epistemology rather than LIS. Wilson himself called it a work of social epistemology, greatly elaborating on a term coined by another bibliographical theorist, Margaret Egan (Egan and Shera 1952).

When Wilson wrote, philosophers had traditionally concerned themselves with first-hand knowledge, gained through direct experience. They had ignored knowledge of the extremely common second-hand kind, gained not through direct experience but by taking the word of others that something is the case (2HK, 13-26). What others say, of course, is not necessarily knowledge (or information in usually refer to beliefs obtained by those means, Wilson the strict sense); what we hear or read must be true or at thought it high time to examine trustworthiness, especially least well-founded. Second-hand knowledge thus involves in writings, as a major qualitative variable. LIS, he points out, questions of truthful and hence trustworthy communica- regularly announces “information” or “knowledge” as its tion. Those most to be trusted in some matter are the au- stock in trade yet fails to emphasize truthfulness—the de- thorities in that matter, and learning who they are, or what fining feature of those terms—in documentary quality con- they have written, is a social pursuit. With some excep- trol (171-179). 
Although social epistemology has become a tions, such as cult leaders, their authority will be limited to thriving branch of philosophy, quality in this sense is still not hazily defined intellectual spheres (as wide as “all genetics” much addressed in LIS, even though the field takes written or as narrow as “Antiguan stamps”), and, even within those sources of answers as its purview. Given that we want to spheres, the degree to which their words can be trusted read truthful or at least evidence-based accounts of the will vary (19-20). In any case, their authority will suppos- world, what writings can we trust to provide such accounts, edly have been tested and determined over some period by and why should we trust them (165-170)? What roles do in- other persons qualified to do so (26-35). The latter may be stitutions (81-83), professions (131-134), the arts (107-112), the authority’s peers within some activity, or they may be intellectual fashions (57-71), and factions (90-93) play in de- assessors and critics from outside the activity (15): termining authority? Who can advise us on the epistemic status of questions (17-18)—that is, on which questions are Cognitive authority is influence on one’s thoughts closed (answerable by knowledge) and which remain open that one would consciously recognize as proper. The (answerable only by opinion)? These are the sorts of topics weight carried by the words is simply the legitimate Wilson explores (See also Rieh 2002; McKenzie 2003; influence they have. Sundin and Johannisson 2004; Rieh and Danielson 2007). Today, any developed society has a “knowledge indus- People’s influence (21-26) may be justified by their métiers try” (39-46)—that is, a multitude of learned groups (81- (in which specific knowledge is expected) or by their rep- 114) that publish claims of fact and value about the world. 
utations (among the general public or a particular circle, It also has even more non-learned groups (123-156) that such as one’s friends). They may have said or written some- confidently state what is the case in one matter after an- thing that is intrinsically plausible (if we know enough other. As a rhetorical device, Wilson imagines all such about a topic to infer what is plausible). They may have claims put before a jury that could examine which of them successfully performed tasks relevant to their claims are the best-attested—the most authoritative—and why (which is probably the best test). Finally, they may be be- (83-84). The examination is by no means straightforward, lieved simply because of personal ties (as when a mother nor the results clear-cut. An objection to making trust de- believes her son, whatever he says). pend on truthfulness is that some people—astrologers, for Wilson calls such authority “cognitive” to distinguish it instance—gain reputations as authorities not by being from administrative authority (the power to compel behav- truthful but by communicating what the credulous want to ior). In this context, he also distinguishes between an au- hear (34-35). An objection to making trust depend on per- thority and an expert (26-30). The two terms are usually sonal reputation or achievement is that some writings have taken as synonymous, but, strictly speaking, expertise is deservedly high authority—dictionaries and atlases, for in- oriented toward content, while authority is social in nature. stance—yet their authors are not at all well-known (81, The expert is simply one who knows a lot about some- 169). 
Wilson takes up many such complications; his book thing; the authority shares that knowledge with others can be understood as an inquiry into the “-worthiness” (The last person on earth could be an expert in survival part of “trustworthy.” His conclusions, as usual, are skep- techniques, yet, lacking an audience, would no longer be an tical; scare-quotes can always be placed around “authority” authority). Ordinarily, of course, authorities with expertise of any kind, but in his view some groups, such as natural will seek to communicate it in a trustworthy way. One test scientists (82-88), are more worthy of trust than others, is whether they can apply their knowledge creatively to such as social scientists (88-94), because, within human new questions (16), which might include advising on cred- cognitive limits, the quality of their evidence is more com- ible writings in their specialties. They can be wrong in their pelling and their predictions are more reliable (cf. Wilson recommendations, but they can also be pragmatically right 1980, 7-8). (31-32), which sets them apart from those without opin- The final chapter of 2HK focuses on libraries as insti- ions in the matter. tutions that conceivably might vet and prioritize written Since most of what people know, or think they know, claims for various readerships. Wilson imagines, as yet an- consists of beliefs acquired from others through hearing or other ideal, a library service that could not only provide reading, and since “information” and “knowledge” in LIS authoritative accounts of what is known in various matters Knowl. Org. 46(2019)No.4 301 H. D. White. Patrick Wilson but could also rule out the possibility that other writings of the literature”) and authority gained by reading the lit- might be even better. At the utopian extreme, library staff erature produced (“a kind that can be acquired without be- would themselves synthesize such authoritative accounts ing a practitioner in the area at all”). 
While practitioners on behalf of their users (170-171). In real life, of course, and producers know how to conduct and evaluate research they do nothing of the sort; they simply deliver writings in an area, they are not necessarily experts on its literature. whose putative authority is conferred elsewhere—e.g., by By contrast, literature experts know the substance and his- scientific or scholarly associations, publishing houses, and tory of specific works, their intellectual or methodological review journals (165-169). Although the judgments of perspectives, their intertextual relationships, their authors’ these latter institutions may be challenged, libraries are not reputations. Expertise along these latter lines, Wilson as- in the running as a viable alternative. Again, librarians on serts, is sufficient for evaluating research. Literature ex- the whole lack the subject expertise to vet the credibility perts may also act as consultants (Wilson 1991, 263): of texts. Nor is there any societal demand for them to as- sume this role, or desire on their part to do so (176-179). I can ask a person who knows a body of literature Wilson thus retains from his earlier books the theme of well “Is there anything there that I should know librarians’ inadequacy as experts (scholar-librarians ex- about?” and hope that, once I have made it clear cepted). Yet he does not dismiss them entirely. He grants what my own interests and problems are, the other that they do evaluate the cognitive authority of one class will be able to make connections between my situa- of works—the printed or digital reference tools they use tion and the literature of their field and steer me to- in answering brief, closed questions (180-181). This au- ward works that I might otherwise never have heard thority is reflected in the lore that is part of their training of. 
The crucial ability involved is the ability to see, or in coursework or on the job (For example, they might learn imagine, indirect or nonobvious relevances—i.e., the that Gmelin and Beilstein are trusted databases in, respec- possible utility of works that have no obvious connec- tively, inorganic and organic chemistry). Librarians more- tion at all to my interests, which I would never have over perform tasks ancillary to critical evaluations by oth- found by direct search because it would not have oc- ers, such as building and managing collections, assembling curred to me to search for them. This ability, though requested writings quickly, and advising on the reputations marvelous, is not all that rare. Good librarians have of sources. Those activities facilitate the in-depth evalua- it; graduate students may have it, helping faculty tion of texts by persons whose subject expertise is more members by identifying potentially interesting mate- advanced and specialized. As generalists, librarians may rial in regions unfamiliar to the faculty member. complement the specialists by serving, in a limited way, as “authorities on authorities.” Although unable to judge the Note the upgrade in librarians’ status from being mere aids trustworthiness of most texts themselves, they can provide in 2KoP (see also Wilson 1983b [1992], 246). information relevant to such evaluations—e.g., facts on au- The strong claim in Wilson (1991) is that librarians can thors’ careers, reviews of their work, even their citation teach students to be literature experts through biblio- counts. Wilson writes (180): graphic instruction (BI) courses. Departing uncharacteris- tically from skepticism, he says that BI can enable a student Librarians are in a particularly advantageous position to evaluate the trustworthiness of texts in an area without to survey a wide field, to be at least superficially ac- knowing how to do research in it. 
For once, he seems too quainted with the work of many different people, optimistic about what librarians can do, because a BI with many books, with many works evaluating and course alone could not give students these powers. In his summarizing the state of knowledge in different account, the BI librarian would choose a specimen litera- fields.***[T]hey are in advantageous positions to de- ture and present students with its “topography”—e.g., its velop a wide familiarity with reputations, with chang- sub-literatures, its bibliographical works, its genres of pub- ing currents of thought, with external signs of suc- lication, its indexing schemes, and its links to other fields cess and failure. Along with knowledge of the stand- (266-267). Students would then do similar topographies in ing of individuals, they can accumulate information research areas of their choice, an approach that prefigures about the standing of particular texts: particularly domain analysis (Hjørland 2002). As such, it seems appro- classics of different fields, standard works, and the priate only for advanced students with specialized subject like. interests. Even so, BI librarians could not assign their stu- dents the readings necessary to master an area’s subject Some years later, Wilson (1991, 263) distinguished be- matter, and only intensive, thoughtful interaction with tween cognitive authority gained by doing research, (“the multiple texts over time could give them cognitive author- kind of authority claimed by practitioners and producers ity as literature experts. It is that interaction, rather than 302 Knowl. Org. 46(2019)No.4 H. D. White. Patrick Wilson

“topographical” knowledge, that enables a librarian, a successes and possible improvements. The philosophically graduate student, or anyone else to recommend even ob- unusual inclusion of bibliography in the diagram problem- vious works to others. To recommend useful nonobvious atizes its relation to knowledge, allowing Wilson to discuss, works requires wide reading plus the ability to see hidden for example, how expert consultants complement biblio- relevances on the fly, and creative insight like that cannot graphical instruments. Indeed, old passages of his lend be a goal of BI, because it cannot be taught. themselves to a critique of present-day recommender sys- Similarly, dubious is Wilson’s claim that BI would enable tems, which are attempts to automate the role of consult- students to evaluate the trustworthiness of texts. A student ants through algorithmic operations on bibliographical cannot become a chemist, or referee papers in chemistry, data. Designers of these systems typically judge their suc- by taking a course in the structure of the chemical litera- cess by how well they return documents that resemble ex- ture, nor does a student who domain-analyzes the maser plicit queries. By contrast, Wilsonian consultants recom- literature, thereby, qualify as an authority on masers. As mend documents on the basis of implicit criteria, such as noted in 2HK (51-56), insiders in all learned disciplines trustworthiness, indispensability, and nonobviousness. Im- guard their autonomy as judges of cognitive authority, and plicit criteria are of course a problem for computation, but they would insist that socialization and experience in their Wilson at least prompts designers to explore ways of op- research specialties are necessary to judge the soundness erationalizing them with explicit data (see, e.g., White 2017; of contributions to them—not mere reading knowledge, Jin and Saule 2018). let alone a mere BI course. 
Wilson (1991, 268) admits as The three levels of the diagram are always simultane- much, but urges outsiders to make personal evaluations ously present in his books. When the interviewer in Wilson that simply flout insiders’ opinions. Granted, outsiders (2000, 134-135) asked him to name his most important with literature expertise sometimes persuasively evaluate publications, he answered [slightly edited]: research in areas not their own—Wilson mentions the lit- erary critic Frederick Crews on Freudian psychology— Oh, those books. Simply because there are three of and, if they can do it, other talented outsiders can do it them, and they fit together. They’re not independent too. But, again, a BI course is neither necessary nor suffi- books in a sense. They’re all facets of a central subject cient for such assessments; at best it would usefully sup- matter. See, from my point of view it turned out that plement extensive reading and personal gifts. leaving philosophy and coming back to librarian- ship—even though I didn’t expect it at the time, I had 12.0 Conclusion no idea at the time that it was going to work out like this—it turned out that the bibliographical core, the It remains to add that such doubts pale in comparison to bibliographical problem, bibliographical center, pro- the achievements of Wilson’s thought. A spare diagram of vided a perfect platform from which to look at every- that thought captures its comprehensiveness and its high thing else in the world. A few overly enthusiastic clas- applicability to KO and LIS: sifiers and catalogers have said in the past, the classi- fier is in charge of all of knowledge. Well, just in the Knowledge and non-knowledge in minds sense that you’re working with a classification system which tries to cover all of knowledge, and so you en- Writings and recorded sayings thusiastically think of yourself as being somehow Bibliography knowledgeable about all of knowledge. 
Well, in the weakest possible sense you are, I guess. But there is a Reading down, philosophers have always investigated grain of sense to that position. If you start from the knowledge and non-knowledge, but much less often their situation of people saying things about the world, try- problematic representation in writings and recorded say- ing to find out about the world, lying about the world, ings, i.e., in the multitudes of texts external to minds. Still concealing facts about the world, and writing it all less have they investigated problems of intellectual and down, [bibliography] is a good place from which to physical access to those texts through bibliography, i.e., start asking questions about knowledge and the world through formulaic writings about writings. Reading up, the of which it’s knowledge of. So it turned out to be a justification of bibliography is that it gives us certain de- wonderful central position or platform. sirable powers over texts, which in turn give us certain de- sirable powers over knowledge (and valuable non- Wilson’s bibliographical platform is thus a standpoint for knowledge, such as informed opinions or classic fiction). analyzing KO in both librarianship and information sci- A vital matter for study, then, is the nature of these pow- ence. In Wilson (1983c) he sums up the main intellectual ers—particularly their limits and failures as well as their component of LIS as “bibliographical R&D” and calls Knowl. Org. 46(2019)No.4 303 H. D. White. Patrick Wilson

LIS-style information retrieval not a science but a branch Conceptual change has huge consequences for those of engineering. In engineering, solutions to many prob- attempting to organize knowledge for retrieval and lems are reasonably determinate from the outset (Wilson use. Conceptual frameworks get outdated, relevance 1996a, 320). Accordingly, he identified the clearest suc- relations change unpredictably, things fall apart. cesses in information retrieval with the computerization of existing bibliographical data. Attempts at creating biblio- As noted in Wilson (1996b), changes in relevance relations graphical data by computer, such as automating subject place new reading burdens on researchers and impel new classification, he regarded as less successful, probably be- social arrangements for coping, such as collaborative teams cause less determinate. A metaphor in his ASIST Award with members from diverse specialties. But information of Merit speech nicely captures his own role in this regard professionals, too, are affected. Wilson (1983a, 12) foresaw (Wilson 2001a, 11, slightly edited): bibliographical consequences for catalogers. Conceptual change as reflected in new books leads to the creation of I’m no engineer myself … I think of much of my new subject headings. For older books, these new headings work as related to information systems design in the might be better than the headings originally assigned. But way the study of the properties of materials, materials systematic review of older books for possible re-cataloging science, is related to traditional branches of engineer- was not conducted then, nor is it now. As a result, “over time ing … Our version of materials science is to study the amount of misdescribed material is bound to increase, problems in the description of content in terms of the accuracy of the subject catalog declines, the quality is subject matter and form and function and relevance and util- gradually degraded. 
This is something that automatic proce- ity to find out what can be done easily and what can’t dures cannot eliminate. There is no automatic recognition be done easily or at all. And there is even an analogue process for misdescribed works.” in our field to the strength of materials that plays For his breadth of vision alone, Wilson is inexhaustibly such a big role in older branches of engineering. One re-readable. of the most important properties I’m interested in when I’m looking for arguments or evidence or References proofs or persuasive cases is strength and the ability to bear a lot of weight. Andersen, Jack. 2004. “Analyzing the Role of Knowledge Organization in Scholarly Communication: An Inquiry At the same time, Wilson (1996a, 321-322) champions the into the Foundation of Knowledge Organization.” fact that LIS sprawls beyond engineering into the social, be- PhD diss. Royal School of Library and Information Sci- havioral, and human sciences, including his own brand of ence, Copenhagen. social epistemology, defined as “the social study of Andersen, Jack and Laura Skouvig. 2006. “Knowledge Or- knowledge production and use” (Wilson 2001a, 11). From ganization: A Sociohistorical Analysis and Critique.” Li- his bibliographical platform, writings reflecting knowledge brary Quarterly 76: 300-22. production and use are looked at in terms of bibliographical Bates, Marcia J. 1976. “Rigorous Systematic Bibliography.” consequences. LIS is unique among fields in regarding bib- RQ 16: 7-26. liographical consequences as a central concern, and Wilson Chatman, Elfreda A. 1983. “The Diffusion of Information was exceptionally wide-ranging in what he made of this. among the Working Poor.” PhD diss., University of Cal- Take a hypothetical case, based on his interest in cognitive ifornia, Berkeley. authority. As he implies, although we regularly read to gain Chua, Amy. 2018. 
“By the Book: Amy Chua.” New York knowledge, not every text delivers it authoritatively; some Times Book Review, February, 4, 8. writers, for instance, lie or conceal facts about the world. Cooper, William S. 1971. “A Definition of Relevance for In- Should we, therefore, try to index writings by their general formation Retrieval.” Information Storage and Retrieval 7: trustworthiness, as is done with monographs in the Human 19-37. Relations Area Files? Could star-ratings be given for degree Coyle, Karen. 2016. FRBR Before and After: A Look at Our of credibility? If not, should we at least link non-fiction Bibliographic Models. Chicago: ALA Editions. books in online library catalogs to their reviews, as is done Egan, Margaret E. and Jesse H. Shera. 1952. “Foundations in Amazon.com? The standard guides for bibliographical of a Theory of Bibliography.” Library Quarterly 22: 125- descriptions of writings are of course mute on this matter, 37. but it is the sort of problem Wilson might have relished. Furner, Jonathan. 2010. “Philosophy and Information Stud- Or consider the problem of rapid conceptual change, ies.” Annual Review of Information Science and Technology 44: which he highlighted in Wilson (2001a, 11): 159-200. 304 Knowl. Org. 46(2019)No.4 H. D. White. Patrick Wilson

Garfield, Eugene. 1955 (2006). “Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas.” Science 122, no. 3159: 108-11. [Pagination here from reprint in International Journal of Epidemiology 35: 1123-27.]
Goldman, Alvin and Thomas Blanchard. 2018. “Social Epistemology.” The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta. https://plato.stanford.edu/archives/sum2018/entries/epistemology-social/
Goodman, James. 2019. “Freed, but Not Free,” review of Separate: The Story of Plessy v. Ferguson, and America’s Journey from Slavery to Segregation by Steve Luxenberg. New York Times Book Review. February, 24, 11.
Hjørland, Birger. 1996. “Overload, Quality and Changing Conceptual Frameworks.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen and Patrick Wilson. Oslo: Scandinavian University Press, 35-67.
Hjørland, Birger. 2001. “Towards a Theory of Aboutness, Subject, Topicality, Theme, Domain, Field, Content … and Relevance.” Journal of the American Society for Information Science and Technology 52: 774-8.
Hjørland, Birger. 2002. “Domain Analysis in Information Science: Eleven Approaches; Traditional as well as Innovative.” Journal of Documentation 58: 422-62.
Hjørland, Birger. 2003. “Fundamentals of Knowledge Organization.” Knowledge Organization 30: 87-111.
Hjørland, Birger. 2008. “What is Knowledge Organization?” Knowledge Organization 35: 86-101.
Hjørland, Birger. 2016. “Knowledge Organization.” Knowledge Organization 43: 475-84.
Hodges, Theodora L. 1972. “Citation Indexing: Its Potential for Bibliographical Control.” PhD diss., University of California, Berkeley.
Jin, Haofeng and Erik Saule. 2018. “Toward Finding Non-Obvious Papers: An Analysis of Citation Recommender Systems.” https://arxiv.org/pdf/1812.11252
Joudrey, Daniel N. and Arlene G. Taylor. 2018. The Organization of Information. 4th ed. Santa Barbara, CA: Libraries Unlimited.
McKenzie, Pamela J. 2003. “Justifying Cognitive Authority Decisions: Discursive Strategies of Information Seekers.” Library Quarterly 73: 261-88.
Munch-Petersen, Erland. 1996. “Patrick Wilson and the Classics.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen and Patrick Wilson. Oslo: Scandinavian University Press, 233-43.
Olaisen, Johan, Erland Munch-Petersen and Patrick Wilson, eds. 1996. Information Science: From the Development of the Discipline to Social Interaction. Oslo: Scandinavian University Press.
Reddy, Michael J. 1979. “The Conduit Metaphor: A Case of Frame Conflict in Our Language about Language.” In Metaphor and Thought, ed. Andrew Ortony. New York: Cambridge University Press, 284-324.
Rieh, Soo Young. 2002. “Judgment of Information Quality and Cognitive Authority in the Web.” Journal of the American Society for Information Science and Technology 53: 145-61.
Rieh, Soo Young and David R. Danielson. 2007. “Credibility: A Multidisciplinary Framework.” Annual Review of Information Science and Technology 41: 307-64.
Smiraglia, Richard P. 2007. “Two Kinds of Power: Insight into the Legacy of Patrick Wilson.” In Proceedings of the Annual Conference of CAIS/Actes du congrès annuel de l'ACSI, ed. Kimiz Dalkir and Clément Arsenault. https://journals.library.ualberta.ca/ojs.cais-acsi.ca/index.php/cais-asci/article/view/246/208
Smiraglia, Richard P. 2014. “Wilson” in The Elements of Knowledge Organization, 9-12. Cham: Springer International.
Sundin, Olof and Jenny Johannisson. 2004. “Pragmatism, Neo-Pragmatism, and Sociocultural Theory: Communicative Participation as a Perspective in LIS.” Journal of Documentation 61: 23-43.
Svenonius, Elaine. 2000. The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press.
Swanson, Don R. 1980. “Libraries and the Growth of Knowledge.” Library Quarterly 50: 112-34.
Swanson, Don R. 1986. “Undiscovered Public Knowledge.” Library Quarterly 56: 103-18.
White, Howard D. 1992. “Publication and Bibliographic Statements.” In For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates and Patrick Wilson. Norwood, NJ: Ablex, 81-116.
White, Howard D. 2017. “Bag of Works Retrieval: TF*IDF Weighting of Works Co-Cited with a Seed.” International Journal of Digital Libraries 19: 139-49.
White, Howard D., Marcia J. Bates and Patrick Wilson. 1992. For Information Specialists: Interpretations of Reference and Bibliographic Work. Norwood, NJ: Ablex.
Wilson, Patrick. 1956. Government and Politics of India and Pakistan, 1885-1955: A Bibliography of Works in Western Languages. Modern India Project. Bibliographical Study 2. Berkeley, CA: South Asia Studies, Institute of East Asiatic Studies, University of California.
Wilson, Patrick. 1957a. South Asia: A Selected Bibliography on India, Pakistan, Ceylon. New York: American Institute of Pacific Relations.
Wilson, Patrick. 1957b. A Checklist of the Writings of M. N. Roy. Modern India Project. Bibliographical Study 1. Berkeley, CA: South Asia Studies, Institute of East Asiatic Studies, University of California.

Wilson, Patrick. 1960a. “On Interpretation and Understanding.” PhD diss., University of California, Berkeley.
Wilson, Patrick. 1960b. “Austin on Knowing.” Inquiry: An Interdisciplinary Journal of Philosophy 3: 49-60.
Wilson, Patrick. 1965. “Quine on Translation.” Inquiry: An Interdisciplinary Journal of Philosophy 8: 198-211.
Wilson, Patrick. 1966a. “The Need to Justify.” The Monist 50: 267-80.
Wilson, Patrick. 1966b. Science in South Asia, Past and Present: A Preliminary Bibliography. New York: Foreign Area Materials Center, University of the State of New York.
Wilson, Patrick. 1968. Two Kinds of Power: An Essay on Bibliographical Control. Berkeley: University of California Press.
Wilson, Patrick. 1973. “Situational Relevance.” Information Storage and Retrieval 9: 457-71.
Wilson, Patrick. 1977. Public Knowledge, Private Ignorance: Toward a Library and Information Policy. Westport, CT: Greenwood Press.
Wilson, Patrick. 1978. “Some Fundamental Concepts of Information Retrieval.” Drexel Library Quarterly 14, no. 2: 10-24.
Wilson, Patrick. 1979a. “The End of Specificity.” Library Resources & Technical Services 23: 116-22.
Wilson, Patrick. 1979b. “Utility-Theoretic Indexing.” Journal of the American Society for Information Science 30: 169-70.
Wilson, Patrick. 1980. “Limits to the Growth of Knowledge: The Case of the Social and Behavioral Sciences.” Library Quarterly 50: 4-21.
Wilson, Patrick. 1983a. “The Catalog as Access Mechanism: Background and Concepts.” Library Resources & Technical Services 27: 4-17. [Pagination here from the reprint in White, Bates, and Wilson, 230-46.]
Wilson, Patrick. 1983b (1992). “Pragmatic Bibliography.” In Back to the Books: Bibliographic Instruction and the Theory of Information Sources, ed. Ross Atkinson. Chicago: Association of Research Libraries, 5-15.
Wilson, Patrick. 1983c. “Bibliographical R&D.” In The Study of Information: Interdisciplinary Messages, ed. Fritz Machlup and Una Mansfield. New York: Wiley, 389-97.
Wilson, Patrick. 1983d. Second-Hand Knowledge: An Inquiry into Cognitive Authority. Westport, CT: Greenwood Press.
Wilson, Patrick. 1986. “The Face Value Rule in Reference Work.” RQ 25: 468-75.
Wilson, Patrick. 1989a. “The Second Objective.” In The Conceptual Foundations of Descriptive Cataloging, ed. Elaine Svenonius. San Diego, CA: Academic Press, 5-16.
Wilson, Patrick. 1989b. “Interpreting the Second Objective of the Catalog.” Library Quarterly 59: 339-53.
Wilson, Patrick. 1990. “Copyright, Derivative Rights, and the First Amendment.” Library Trends 39: 92-110.
Wilson, Patrick. 1991. “Bibliographic Instruction and Cognitive Authority.” Library Trends 39: 259-70.
Wilson, Patrick. 1992. “Searching: Strategies and Evaluation.” In For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates and Patrick Wilson. Norwood, NJ: Ablex, 153-81.
Wilson, Patrick. 1993a. “The Value of Currency.” Library Trends 41: 632-43.
Wilson, Patrick. 1993b. “Communication Efficiency in Research and Development.” Journal of the American Society for Information Science 44: 376-82.
Wilson, Patrick. 1995. “Unused Relevant Information in Research and Development.” Journal of the American Society for Information Science 46: 45-51.
Wilson, Patrick. 1996a. “The Future of Research in Our Field.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen and Patrick Wilson. Oslo: Scandinavian University Press, 319-23.
Wilson, Patrick. 1996b. “Interdisciplinary Research and Information Overload.” Library Trends 45: 192-203.
Wilson, Patrick. 1996c. “Some Consequences of Information Overload and Rapid Conceptual Change.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen and Patrick Wilson. Oslo: Scandinavian University Press, 21-34.
Wilson, Patrick. 1998. “Patrick Wilson: A Bibliographer among Catalogers.” Cataloging & Classification Quarterly 25: 305-16.
Wilson, Patrick. 2000. Patrick G. Wilson, Philosopher of Information: An Eclectic Imprint on Berkeley's School of Librarianship, 1965-1991. Interviewed by Laura McCreery. Introduction by Howard D. White. Library School Oral History Series and University of California, Source of Community Leaders Series, Regional Oral History Office, The Bancroft Library, University of California, Berkeley. https://oac.cdlib.org/ark:/13030/kt958006vr/?brand=oac4
Wilson, Patrick. 2001a. “On Accepting the ASIST Award of Merit.” Bulletin of the American Society for Information Science and Technology 28, no. 2: 10-11.
Wilson, Patrick. 2001b. Review of The Intellectual Foundation of Information Organization, by Elaine Svenonius. College & Research Libraries 62: 203-4.
Wilson, Patrick and Mona Farid. 1979. “On the Use of the Records of Research.” Library Quarterly 49: 127-45.
Wilson, Patrick and Nick Robinson. 1990. “Form Subdivisions and Genre.” Library Resources & Technical Services 34: 36-43.

Yee, Martha M. 1995. “What Is a Work? Part 4. Cataloging Theorists and a Definition.” Cataloging & Classification Quarterly 20, no. 2: 3-23.
Zeng, Marcia Lei. 2008. “Knowledge Organization Systems.” Knowledge Organization 35: 160-82.

Appendix: Wilson’s Oeuvre

Wilson, Patrick. 1956. Government and Politics of India and Pakistan, 1885-1955: A Bibliography of Works in Western Languages. South Asia Studies, Institute of East Asiatic Studies, University of California, Berkeley.
Wilson, Patrick. 1957. South Asia: A Selected Bibliography on India, Pakistan, Ceylon. New York: American Institute of Pacific Relations.
Wilson, Patrick. 1957. A Checklist of the Writings of M. N. Roy. South Asia Studies, Institute of East Asiatic Studies, University of California, Berkeley.
Wilson, Patrick. 1960. “On Interpretation and Understanding.” PhD diss., University of California, Berkeley.
Wilson, Patrick. 1960. “Austin on Knowing.” Inquiry: An Interdisciplinary Journal of Philosophy 3: 49-60.
Wilson, Patrick. 1965. “Quine on Translation.” Inquiry: An Interdisciplinary Journal of Philosophy 8: 198-211.
Wilson, Patrick. 1966. “The Need to Justify.” The Monist 50, no. 2: 267-80.
Wilson, Patrick. 1966. Science in South Asia, Past and Present: A Preliminary Bibliography. New York: Foreign Area Materials Center, University of the State of New York.
Wilson, Patrick. 1968. Two Kinds of Power: An Essay on Bibliographical Control. Berkeley: University of California Press.
Wilson, Patrick. 1973. “Situational Relevance.” Information Storage and Retrieval 9: 457-71.
Wilson, Patrick. 1977. Public Knowledge, Private Ignorance: Toward a Library and Information Policy. Westport, CT: Greenwood Press.
Wilson, Patrick. 1978. “Some Fundamental Concepts of Information Retrieval.” Drexel Library Quarterly 14, no. 2: 10-24.
Wilson, Patrick. 1978. Review of Libraries in Post-Industrial Society, ed. Leigh Estabrook. Journal of Academic Librarianship 4: 95-6.
Wilson, Patrick. 1979. “The End of Specificity.” Library Resources & Technical Services 23: 116-22.
Wilson, Patrick. 1979. “Utility-Theoretic Indexing.” Journal of the American Society for Information Science 30: 169-70.
Wilson, Patrick. 1980. “Limits to the Growth of Knowledge: The Case of the Social and Behavioral Sciences.” Library Quarterly 50: 4-21.
Wilson, Patrick. 1980. Review of New Perspectives for Reference Service in Academic Libraries, by Raymond G. McInnis. Library Quarterly 50: 263-4.
Wilson, Patrick. 1980. Review of The Vital Network: A Theory of Communication and Society, by Patrick Williams and Joan T. Pearce. Journal of Academic Librarianship 5: 351.
Wilson, Patrick. 1981. Review of Information, Organization, and Power: Effective Management in the Knowledge Society, by Dale E. Zand. Journal of Academic Librarianship 7: 295.
Wilson, Patrick. 1983. “The Catalog as Access Mechanism: Background and Concepts.” Library Resources & Technical Services 27: 4-17.
Wilson, Patrick. 1983. “Pragmatic Bibliography.” In Back to the Books: Bibliographic Instruction and the Theory of Information Sources, ed. Ross Atkinson. Chicago: Association of Research Libraries. 5-15. [Reprinted in For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates and Patrick Wilson. Norwood, NJ: Ablex. 230-246.]
Wilson, Patrick. 1983. “Bibliographical R&D.” In The Study of Information: Interdisciplinary Messages, ed. Fritz Machlup and Una Mansfield. New York: Wiley. 389-97.
Wilson, Patrick. 1983. Second-Hand Knowledge: An Inquiry into Cognitive Authority. Westport, CT: Greenwood Press.
Wilson, Patrick. 1983. Review of Knowledge and the Flow of Information, by Fred I. Dretske. Information Processing & Management 19: 61-2.
Wilson, Patrick. 1983. Review of Reading Research and Librarianship: A History and Analysis, by Stephen Karetzky. Journal of Academic Librarianship 8: 361.
Wilson, Patrick. 1984. Review of The Subject in the Dictionary Catalog from Cutter to the Present, by Francis Miksa. Library Quarterly 54: 109-10.
Wilson, Patrick. 1985. Review of Knowledge Structure and Use: Implications for Synthesis and Interpretation, ed. Spencer A. Ward and Linda J. Reed. Information Processing & Management 21: 370.
Wilson, Patrick. 1986. “The Face Value Rule in Reference Work.” RQ 25: 468-75.
Wilson, Patrick. 1989. “The Second Objective.” In The Conceptual Foundations of Descriptive Cataloging, ed. Elaine Svenonius. San Diego, CA: Academic Press. 5-16.
Wilson, Patrick. 1989. “Interpreting the Second Objective of the Catalog.” Library Quarterly 59: 339-53.
Wilson, Patrick. 1990. “Copyright, Derivative Rights, and the First Amendment.” Library Trends 39: 92-110.
Wilson, Patrick. 1991. “Bibliographic Instruction and Cognitive Authority.” Library Trends 39: 259-70.
Wilson, Patrick. 1991. Review of Envisioning Information and The Visual Display of Quantitative Information, by Edward R. Tufte. College & Research Libraries 52: 382-3.
Wilson, Patrick. 1992. “Searching: Strategies and Evaluation.” In For Information Specialists: Interpretations of Reference and Bibliographic Work, by Howard D. White, Marcia J. Bates and Patrick Wilson. Norwood, NJ: Ablex. 153-81.

Wilson, Patrick. 1992. Review of History and Communications: Harold Innis, Marshall McLuhan, the Interpretation of History, by Graeme Patterson. Library Quarterly 62, no. 2: 232-3.
Wilson, Patrick. 1993. “The Value of Currency.” Library Trends 41: 632-43.
Wilson, Patrick. 1993. “Communication Efficiency in Research and Development.” Journal of the American Society for Information Science 44: 376-82.
Wilson, Patrick. 1993. Review of Change and Challenge in Library and Information Science Education, by Margaret F. Stieg. College & Research Libraries 54: 275-6.
Wilson, Patrick. 1993. Review of Dilemmas in the Study of Information: Exploring the Boundaries of Information Science, by S. D. Neill. Journal of Education for Library and Information Science 34: 90-1.
Wilson, Patrick. 1994. Review of The Barefoot Expert: The Interface of Computerized Knowledge Systems and Indigenous Knowledge Systems, by Doris M. Schoenhoff. Journal of the American Society for Information Science 45: 220-1.
Wilson, Patrick. 1994. Review of The Metaphysics of Virtual Reality, by Michael Heim. College & Research Libraries 55: 87-8.
Wilson, Patrick. 1995. “Unused Relevant Information in Research and Development.” Journal of the American Society for Information Science 46: 45-51.
Wilson, Patrick. 1995. Review of Knowledge-Based Systems for General Reference Work: Applications, Problems, and Progress, by John V. Richardson. Journal of the American Society for Information Science 46: 792-793.
Wilson, Patrick. 1995. Review of Thinking Through Technology: The Path between Engineering and Philosophy, by Carl Mitcham. College & Research Libraries 56: 184-186.
Wilson, Patrick. 1996. “The Future of Research in Our Field.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen, and Patrick Wilson. Oslo: Scandinavian University Press. 319-23.
Wilson, Patrick. 1996. “Interdisciplinary Research and Information Overload.” Library Trends 45: 192-203.
Wilson, Patrick. 1996. “Some Consequences of Information Overload and Rapid Conceptual Change.” In Information Science: From the Development of the Discipline to Social Interaction, ed. Johan Olaisen, Erland Munch-Petersen and Patrick Wilson. Oslo: Scandinavian University Press. 21-34.
Wilson, Patrick. 1996. Review of The Closing of American Library Schools: Problems and Opportunities, by Larry J. Ostler, Therrin C. Dahlin and J. D. Wallardson. College & Research Libraries 57: 197-8.
Wilson, Patrick. 1996. Review of Theories of the Information Society, by Frank Webster. College & Research Libraries 57: 487-9.
Wilson, Patrick. 1997. Review of The Global Information Society, by William J. Martin. Journal of Academic Librarianship 23: 145-6.
Wilson, Patrick. 1997. Review of Cognition and Complexity: The Cognitive Science of Managing Complexity, by Wayne W. Reeves. Journal of Education for Library and Information Science 38: 232-3.
Wilson, Patrick. 1998. “Patrick Wilson: A Bibliographer among Catalogers.” Cataloging & Classification Quarterly 25: 305-16.
Wilson, Patrick. 1998. Cognition and Complexity - Response. Journal of Education for Library and Information Science 39: 155-6.
Wilson, Patrick. 1998. Review of Information Seeking and Subject Representation: An Activity-Theoretical Approach to Information Science, by Birger Hjørland. College & Research Libraries 59: 287-8.
Wilson, Patrick. 1998. Review of The Scientific Revolution, by Steven Shapin. Library Quarterly 68: 102-3.
Wilson, Patrick. 2000. Patrick G. Wilson, Philosopher of Information: An Eclectic Imprint on Berkeley's School of Librarianship, 1965-1991. Interviewed by Laura McCreery. Introduction by Howard D. White. Library School Oral History Series and University of California, Source of Community Leaders Series, Regional Oral History Office, The Bancroft Library, University of California, Berkeley. https://oac.cdlib.org/ark:/13030/kt958006vr/?brand=oac4
Wilson, Patrick. 2000. Review of The Information of the Image, by Allan D. Pratt. Library Quarterly 70: 135-7.
Wilson, Patrick. 2001. “On Accepting the ASIST Award of Merit.” Bulletin of the American Society for Information Science and Technology 28, no. 2: 10-11.
Wilson, Patrick. 2001. Review of The Intellectual Foundation of Information Organization, by Elaine Svenonius. College & Research Libraries 62: 203-4.
Wilson, Patrick and Mona Farid. 1979. “On the Use of the Records of Research.” Library Quarterly 49: 127-45.
Wilson, Patrick and Nick Robinson. 1990. “Form Subdivisions and Genre.” Library Resources & Technical Services 34: 36-43.

308 Knowl. Org. 46(2019)No.4 R. Smiraglia. Work

Work† Richard P. Smiraglia Institute for Knowledge Organization and Structure, Inc., Shorewood WI 53211, USA,

Richard P. Smiraglia holds a PhD in information from the University of Chicago. He is Senior Fellow and Executive Director of the Institute for Knowledge Organization and Structure, Inc. and is Editor-in-Chief of this journal. He also is Professor Emeritus of the iSchool at the University of Wisconsin-Milwaukee. He was 2017-2018 KNAW Visiting Professor at DANS (Data Archiving and Networked Services division of the Royal Netherlands Academy of the Arts and Sciences), The Hague, The Netherlands, where he remains visiting fellow, and was the recipient of the 2018 Frederick G. Kilgour Award for Research in Library and Information Technology.

Smiraglia, Richard P. 2019. “Work.” Knowledge Organization 46(4): 308-319. 58 references. DOI:10.5771/0943-7444-2019-4-308.

Abstract: A work is a deliberately created informing entity intended for communication. A work consists of abstract intellectual content that is distinct from any object that is its carrier. In library and information science, the importance of the work lies squarely with the problem of information retrieval. Works are mentefacts—intellectual (or mental) constructs that serve as artifacts of the cultures in which they arise. The meaning of a work is abstract at every level, from its creator’s conception of it, to its reception and inherence by its consumers. Works are a kind of informing object and are subject to the phenomenon of instantiation, or realization over time. Research has indicated a base typology of instantiation. The problem for information retrieval is to simultaneously collocate and disambiguate large sets of instantiations. Cataloging and bibliographic tradition stipulate an alphabetico-classed arrangement of works based on an authorship principle. FRBR provided an entity-relationship schema for enhanced control of works in future catalogs, which has been incorporated into RDA. FRBRoo provides an empirically more precise model of work entities as informing objects and a schema for their representation in knowledge organization systems.

Received: 21 September 2018; Accepted: 22 September 2018

Keywords: works, mentefacts, catalogs, disambiguation, collocation, FRBR, FRBRoo, information retrieval, semiotic, superworks, canonicity

† Derived from the article of similar title in the ISKO Encyclopedia of Knowledge Organization, Version 1.0; published 2018-09-18, last edited 2019-05-02. Article category: Core concepts in KO

1.0 Introduction

A work is the essence of a creation, such as a novel, a symphony, a painting, a statue, a thesis, etc., intended by its creator to be communicated with some audience. More formally, a work is a deliberately created informing entity intended for communication. Known variously in humanistic disciplines as work, oeuvre, opus, etc., a work can be a critical entity for ordering and retrieval in bibliographic information systems such as catalogs or indexes. A work consists of abstract intellectual content that is distinct from any object that is its carrier. This distinction between the work and the item that carries it is critical for information retrieval, because the attributes of works are those of their abstract intellectual content, whereas the attributes of items are those of (usually physical) informing objects.

Works need not be literary or even textual. Although most libraries contain mostly books, a wide variety of other means of expression are represented in information retrieval systems. Certain works become very well known, and it is these works for which many iterations might come to be represented together in information retrieval systems. In culture at large, some works serve iconic roles—think of Mona Lisa (Da Vinci) or Eroica Symphony (Beethoven) or Iliad (Homer), or even Gateway Arch (Eero Saarinen) for examples associated with individual creators, or Bible or Kama Sutra for examples of works that are not associated with specific individual creators. These works, which serve somehow iconic roles, can be viewed as semiotic entities—signs, in other words—imbued with various cultural reputations that might extend beyond their intellectual content.

In this article we will consider the importance of distinguishing between works and items and we will look briefly at the history of the treatment of works in information retrieval. We will consider carefully the nature of works including their cultural meaning. We will look at the phenomenon of instantiation that underlies the evolution of sets of works derived from a common progenitor, and we will consider the major conceptual schema currently in use for representing works.

2.0 The importance of works for information retrieval

The concept of the work is of great interest to scholars in the humanities, most notably in literary criticism, philosophy, and musicology, as well as in the interdisciplinary exercise of textual criticism that crosses the boundaries of bibliography and literary criticism. Several authors have called for a theory of the work, notably Foucault (1984) and Tanselle (1989), and others have explored the cultural meaning of works, such as Talbot (2000) and Goehr (1992). But in library and information science the importance of the work lies squarely with the problem of information retrieval. It is widely accepted that information retrieval systems should collocate the works of a particular author, and furthermore, that within that collocated list the iterations—translations, editions, etc.—of a particular work should likewise be collocated. But the problem arises that individual publications, from which indexed citations are harvested by transcription, do not necessarily have identical identifying marks. That is to say, under Charles Dickens, entries for editions of books with the title Bleak House will file together, but translated editions with titles Hokumeikan monogatari and Maison d'Âpre-Vent will not collocate. So a convention is required to cause all of the iterations of Bleak House to be filed together in the system, as well as to keep them distinct from the other works by Charles Dickens. The uniform title was used to good effect for this purpose in Anglo-American and other cataloging traditions through most of the twentieth century, and was recently renamed the preferred title.

However, a second problem arises, and that is the subsequent problem of disambiguation in a file of apparently identical entries. What do we do with a list of hundreds of English-language editions of Bleak House? Typical solutions are to sub-file by publisher or date, but that does not necessarily sort the intellectual content in ways that might be appropriate for retrieval. The same is true for illustrations of Saarinen’s Gateway Arch—none of them is the arch, and all of them are representations of the arch, but all of them are also different from each other. So, disambiguation of large retrieval clusters that might otherwise appear identical requires understanding the nuances of the iterations of works, and the resulting instantiations that might all be present in such a cluster. “Instantiation,” then, is the phenomenon associated with works (and also with all other informing objects) that describes the patterns of iteration over time that result in large, ambiguous clusters in retrieval systems. What has been required is a means of separating the work entity from other entities such as documents (Buckland 2018), sources or information objects in information retrieval systems (Smiraglia 2002a).

3.0 A brief history of the treatment of works in information retrieval

We do not have space here to rehearse the entire history either of the catalog or of information retrieval. But it is important to understand that in the development of the library catalog the movement from “inventory of books” to “device for indexing works” has been a long haul. One must appeal, of course, to Strout’s (1956) famous history of the catalog. And there one will learn that most early attempts to create catalogs look to our eyes like inventories. One will also learn that the innovations that likely brought greater sophistication to the construction of catalogs (things like entry of works under author names, subject gatherings, and so forth) were introduced not by librarians but by booksellers. This should not be surprising. In a time when few were literate the librarian likely had a good grasp of the books under his charge. It was only when it became important to sell books, and various diverse iterations of books, that catalogs required ever greater sophistication.

Nevertheless, by the Enlightenment, the work had begun to receive the attention of librarians such as Thomas Hyde, and by the mid-nineteenth century, Anthony Panizzi. These famous librarians (and engineers of catalogs, and they were engineers) had to concern themselves with the issue of disambiguation. The famous hearings before Commons in which Panizzi ([1848] 1985) defended his catalog structure made that clear. The reader was not so much interested in a particular book, Panizzi asserted, as in a particular work, no matter in what book it appeared. By the twentieth century the issue had become critical for modern librarianship. Eva Verona (1985) wrote about the notion of the literary unit, which was to be a collocating device for a work. The reason was that a nascent information explosion had begun to present librarians and the reading public with many diverse editions, translations, and commentaries of the most sought works.

The catalog was originally structured as an inventory of books. Describe this “item” by transcribing its title page. Then disambiguate the description as need be to make it serve several filing masters by adding subject headings, author headings, and so forth. The uniform title was added in rare instances to collocate editions of canonical works for which a library might have dozens of iterations. Still, however, the work itself remained without operational definition. The second edition of the Anglo-American Cataloguing Rules (AACR2 1988) led to change by presenting a modern approach to information retrieval requiring the use of uniform titles for works that had appeared under different titles. This offered a superstructure of works—a set of alphabetico-classified solutions for organizing (and disambiguating) large files of works (see below).
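
The two retrieval functions discussed in sections 2.0 and 3.0, collocation under a uniform title and disambiguation within the resulting cluster, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Record` fields, the `collocate` and `disambiguate` helpers, and the publication years are hypothetical placeholders and do not come from any cataloging code or real catalog data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical, much-simplified bibliographic record."""
    author: str          # heading under which the record files
    title_proper: str    # title as transcribed from the item
    uniform_title: str   # conventional title shared by all iterations of the work
    language: str
    year: int            # placeholder years, invented for this sketch


def collocate(records):
    """Group records by (author, uniform title) so iterations of a work file together."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[(rec.author, rec.uniform_title)].append(rec)
    return dict(clusters)


def disambiguate(cluster):
    """Sub-file one cluster by language, then date, then transcribed title."""
    return sorted(cluster, key=lambda r: (r.language, r.year, r.title_proper))


# Illustrative records only; the years are not real publication data.
RECORDS = [
    Record("Dickens, Charles", "Maison d'Âpre-Vent", "Bleak House", "French", 1900),
    Record("Dickens, Charles", "Bleak House", "Bleak House", "English", 1950),
    Record("Dickens, Charles", "Hokumeikan monogatari", "Bleak House", "Japanese", 1970),
    Record("Dickens, Charles", "Bleak House", "Bleak House", "English", 1900),
    Record("Dickens, Charles", "Hard Times", "Hard Times", "English", 1900),
]

if __name__ == "__main__":
    for (author, work), cluster in collocate(RECORDS).items():
        print(f"{author}. {work}")
        for rec in disambiguate(cluster):
            print(f"    {rec.language}, {rec.year}: {rec.title_proper}")
```

Filing by transcribed title alone would scatter the three titles of Bleak House across the alphabet; keying on the uniform title gathers them while keeping Hard Times distinct. The sub-filing by language and date is exactly the kind of mechanical solution that, as noted above, does not necessarily sort intellectual content in ways appropriate for retrieval.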


In the latter decades of the twentieth century, research began to provide empirical evidence of the extent of the panoply of iterations of works that might require simultaneous collocation and disambiguation for information retrieval. Studies were conducted using random samples of works gathered from various bibliographical sources, including academic libraries, bibliographic utilities, music and motion picture libraries, and canonical lists of works (Smiraglia 1992; Yee 1993; Vellucci 1997; Smiraglia 2001, 2007b; Petek 2007). User studies suggested more natural groupings of works could be achieved in the catalog (Carlyle 1999). Major results indicated that large proportions of the works that make up library collections exist in multiple iterations, that cultural phenomena seem to play a role in determining which works will do so, and that clustering works by meaningful identifiers would be most useful.

4.0 The nature of works

4.1 Cultural meaning

Works are mentefacts (Gnoli 2018). That is, they are intellectual (or mental) constructs, but as such, they serve as artifacts of the cultures in which they arise. The meaning of a work is abstract at every level, from its creator’s conception of it, to its reception and inherence by its consumers. Various semiotic theories have been used to describe this phenomenon. In Saussurian terms, a work at the time of its origin is fixed in its creator’s intellect and is theoretically, if only for a moment, immutable. But once the work has been offered to the public it is overtaken by its reception, and it becomes infinitely mutable. Furthermore, such works then function along the lines of Peircian symbols—they have cultural meaning in a broad sense that is determined both collectively and individually. Thus, a popular work becomes the property of those for whom it is popular. What is Mona Lisa? A painting? Yes, but much more; it is a human mystery, a mysterious woman, a sign of the human condition—these and many other interpretations keep the painting alive in the consciousness of millions who have never seen the painting and millions more who have never even seen its likeness. Yet they all know what a “Mona Lisa” is—it has iconic semiotic status (that is, its role in culture is larger than life). This is the status works may achieve. And once they do, the job of the library-and-information science community is to curate them.

Works are said to be entities of what Patrick Wilson ([1968] 1978) called the bibliographic universe, a sort of concept-space in which all recorded knowledge exists, represented by specific texts, all in relationship with each other. This metaphor has been extended to show the role of works in knowledge organization (Smiraglia and van den Heuvel 2013, 373):

Therefore in our construct the metaphorical bibliographical universes are populated by entities—knowable elements of reality—that can be seen to exist in relationship to each other—relationships of nearness and distance, of joint motion, of evolution over time, etc. Smiraglia (1996) extended the metaphor logically by suggesting that works and their instantiations cluster in metaphorical constellations, having orbital and therefore gravitational relationship to each other, and that there are different sorts of celestial bodies in the bibliographical universe. These “constellations” are groupings of instantiations of works—not only the progenitor work itself, but also its editions, translations, abridgments, adaptations, excerpts, etc., and their instantiations as well. These have been termed variously “bibliographic families” (Wilson), “superworks” (Svenonius 2001), “textual identity networks” (Leazer and Furner 1999), and “instantiation networks” (Smiraglia 2008).

Curatorship demands understanding and elucidation of the ineluctable qualities of these mentefacts. Thus, a music librarian must know not only Beethoven or his Eroica symphony but also the story of the Napoleonic wars, the history of the Hapsburgs, the cultural and scientific evolution of the symphony as an icon of western musical sophistication, the history of the rise and fall of the symphony orchestra, the appreciation of Beethoven and his rise to cult status, and so on. Placing the editions, Ur-texts, scores and parts, recordings, and other iterations of this “work” requires multidisciplinary knowledge. Such is the task of the cataloger and this is but a single example.

Day (2008, 44-45) drew a distinction between literary works and works of art, suggesting that works should be viewed as “events that are constitutive of meaning by virtue of their negotiation of cultural and social horizons through material forms and techniques.” Works of art (sculpture, painting, etc.) are seen as the result of “work” (in the sense of labor expended) that results in a physical object that may be understood as site-specific and time-valued. Drawing at once on Heidegger and the Visual Resources Association VRA Core standards for description of visual works,1 Day situates works of art and literary works in different epistemic traditions, such that literary works that might begin as mentefacts occupy an epistemic zone in which the transference of their content (i.e., instantiation) is seen as a metaphysical property of their social position, but works of art occupy an epistemic zone in which their technological realization as objects is post-metaphysical. In this scenario the re-presentation of a work of art does not belong necessarily to the same categorically-bounded set as the work itself, whereas the literary work and its re-presentations are all members of the same set.
Day reports that the VRA Core standard distinguishes works of art by attributes of entities such as time period of creation, location of discovery, and current curation. Similarly, the CIDOC Conceptual Reference Model (CRM) (http://www.cidoc-crm.org/), which is a multidisciplinary international meta-level ontology for cultural heritage information sharing, approaches all works (art or otherwise) as results of the events that brought them into being as well as those associated with their persistence across time.

Works also are ontological realities, which makes them objects for knowledge organization. A clear example is the manner in which representations of instantiations of works are gathered in information retrieval systems under name-title appellations—kinds of nominal historicist epistemological anchors—but then subdivided in detailed schema based on mutation of the work’s ideation or its actual expression in text. Works are vehicles for communication (Smiraglia 2001, 57), which also means that they are social entities shaped by culture. Because works are core in every part of human experience—from sacred texts to legal foundations to iconic structures to iconic novels—they have been studied as constructs in many disciplines.2 Philosophy is the discipline that most closely touches knowledge organization, and in that field many voices combine to extol the meaning of a text even as the same voices seek to promote their own texts. We already have mentioned Foucault and his search for the “author”; Barthes’ metaphor of text as tissue (1975)—impermanent, non-persistent, and utterly interpreted only on its reception—makes it clear that even textual works, like statues, lie in the conscious minds of those who behold them. This aligns with Goehr’s reception theory of musical works (1992). Works are culturally critical, but they also are impermanent fixtures in the minds of people.

4.2 The properties of works

Works are mentefacts, which means they are abstract and made up of ideas. Works, obviously, are not the only kinds of mentefacts that occur in information retrieval, but they are unique in the fact that, as creative expressions, they are in some sense ideationally fixed.3 Works have two properties, which are referred to as ideational content—the ideas expressed—and semantic content4—the mode of expression of the ideas. Either ideational or semantic content might be changed in subsequent iterations of a work. Stories abound about authors arriving at print shops in the middle of a press run and changing one word here or there, thus resulting in very slight alterations of semantic content that (likely) do not affect ideational content. But more often, works are translated or abridged or reissued with illustrative matter. In these cases, it is possible to trace over time the evolution of a work as its semantic content changes—such iterations have been termed “derivations” (Smiraglia 2002b). In other cases, works might be adapted for reuse in children’s versions, as screenplays, as librettos, etc. In these cases, it is possible to trace the evolution of the alteration of the ideational content—such iterations have been termed “mutations.”

Works are a kind of informing object, alongside more usually physical entities, such as documents (Buckland 2018), or archival records or naturally occurring artifacts. Like all such informing objects, then, works are subject to the phenomenon of instantiation. The term “instantiation” describes the phenomenon of realization over time (Smiraglia 2008). We have learned that the majority of works exist only in their original instantiation, but that significant numbers (likely around one-third in the bibliographic universe but one-half or more in library collections) exist in multiple instantiations. For every two or three one-off books there is a work like Steinbeck’s Grapes of Wrath that exists in hundreds of editions, and translations, and becomes a screenplay, a motion picture, and so on. The first-published iteration of a work in such a set has been termed the “progenitor,” and we know that older progenitors are associated with larger networks of instantiations, but more recent progenitors are associated with more complex networks of instantiations. That is, works that originated centuries ago are likely to have large networks of editions and translations, but works that originated recently, if they are associated with large instantiation networks, are more likely to have many mutations in their instantiation networks.

We have seen above the two main categories of instantiation—derivation and mutation. From research it has been demonstrated that bibliographic works have at least the following types (Smiraglia 2002, 11):

Derivations
  simultaneous editions
  successive editions
  predecessors
  amplifications
  extractions
  accompanying materials
  musical presentation
  notational transcription
  persistent works

Mutations
  translations
  adaptations
  performances
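
The typology above can also be written down as a small lookup structure, which may be convenient when instantiations in a cluster need to be labelled consistently. The sketch below is only an illustration: the `Category` enum, the `TYPOLOGY` mapping, and the `category_of` helper are hypothetical names, and the assignment of types to categories simply follows the list as printed.

```python
from enum import Enum


class Category(Enum):
    """The two main categories of instantiation named in the text."""
    DERIVATION = "derivation"
    MUTATION = "mutation"


# Types of instantiation, transcribed from the list above.
TYPOLOGY = {
    Category.DERIVATION: (
        "simultaneous editions",
        "successive editions",
        "predecessors",
        "amplifications",
        "extractions",
        "accompanying materials",
        "musical presentation",
        "notational transcription",
        "persistent works",
    ),
    Category.MUTATION: (
        "translations",
        "adaptations",
        "performances",
    ),
}


def category_of(instantiation_type: str) -> Category:
    """Return the category under which a type of instantiation is listed."""
    for category, types in TYPOLOGY.items():
        if instantiation_type in types:
            return category
    raise ValueError(f"not in the typology: {instantiation_type!r}")


if __name__ == "__main__":
    for name in ("successive editions", "translations", "performances"):
        print(f"{name}: {category_of(name).value}")
```

A flat mapping is enough here because the typology has only two levels; a richer model would be needed to record which content, ideational or semantic, a given instantiation alters.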

Research has shown that the majority of instantiations are simultaneous editions (works published in more than one place at one time), successive editions (second, third, etc., subsequent editions), and translations (Smiraglia 2001). The distinction between derivation and mutation is the degree of alteration in ideational content. We assume alteration of the semantic content occurs from one edition to the next. But major change in the presentation of ideas occurs as the work evolves over time in accord with cultural stimuli, which act as market forces to compel motion pictures, musical realizations, and so forth (e.g., the motion pictures Prisoner of Zenda 1937 and 1952, with identical screenplays and music, based on Anthony Hope’s novel of 1894, or the motion pictures The Bishop’s Wife 1947 and The Preacher’s Wife 1996, based on Robert Nathan’s 1928 novel In the Barley Fields, but with almost unrecognizable characters and locales). Market forces are very much present in the research on instantiation (Smiraglia 2007b). We have seen, for example, that works that are associated with very large instantiation networks are more likely to have been published simultaneously at the outset—a strategy well known in publishing.

In a somewhat different vein, Furner (2009, 10) has suggested that works can be described both as relations among things, and as identities of the properties of relations themselves:

There are a number of different kinds of entities that are capable of entering into relations with one another. We might find it convenient to distinguish in some way between worlds, works, words, persons, and so on. It doesn’t really make much difference whether we decide to treat these entities as substances that somehow exist separate from their properties, or simply as bundles of properties. Whatever system of fundamental categories of entities we settle on, we can also use it as the basis of a taxonomy of relations between entities. Depending on our purposes, we might want to distinguish (a) relations between works and people from (b) relations between works and other works, for instance.

Furner relates this to the philosophical positions of nominalism (2010, 186), by which he says they exist as sets of relationships between linguistic expressions, and realism, e.g., the concreteness of a stipulated text. That is, a work is made up of ideation expressed semantically. Judgments about how two instantiations might be exemplars of the same work rely both on nominalist points of view about whether both texts bear the same sets of relationships to other texts, and on realist points of view about the exact match between the textual strings that constitute the expression of the work. He demonstrates that the relationship between two documents that might be instantiations of the same work has the same identity as the relationship between two documents that might be about the same concept, because they are both members of the same class (12):

If I decide that Doc 1 instantiates Work A, what that amounts to is a judgment—an entirely subjective judgment made by me on a particular occasion—that Doc 1 has the property of being an instance of Work A, that Doc 1 is a member of the class of documents that share the property of instantiating Work A … These are just different ways of saying the same thing, and again we can also say that Doc 1 and Doc 2 are similar in the sense that they share the same property or that they are members of the same class. And again, if it turns out that Work B has exactly the same extension as Work A does—in other words, if it turns out that all and only the documents that instantiate Work A instantiate Work B—then we can say that Work A is the same work as Work B.

A related aspect of the phenomenon of instantiation is the re-presentation of content, and it is this property that is endemic in the incorporation of information objects in retrieval systems in which large clusters of seemingly similar content must be simultaneously gathered and disambiguated. Empirical analysis of this phenomenon in museums and archives demonstrated the means by which not only visual representations of specific objects but also metadata associated with them require control—gathering and disambiguation—around a nominal anchor that usually is the identifier for the work (Smiraglia 2006, 2007d, 2008). This research aligns with similar observations from Coleman (2002) concerning scientific models,4 and Greenberg (2009) with regard to life-cycle modeling of data records from evolutionary biology. The concept was recently extended (Smiraglia 2017) to the re-presentation of data in repositories. It is perhaps at this point that the epistemic distinction drawn by Day (2008) between literary works and works of art helps to distinguish between the clusters of instantiated realizations of works that reside primarily as texts and the clusters of instantiated re-presentations of metadata associated with works that reside as objects. The distinction helps both to inform the understanding that works of different kinds possess different properties in fact as well as socially and culturally, and to support the comprehension that the central problem of the work for knowledge organization is its treatment as the nominal anchor for clustering in knowledge organization systems.

Still, probably the most important finding from the empirical research is the discovery that there is a cultural catalyst for the growth of a family of works all derived from a common progenitor. Initially, borrowing a phrase from Wilson ([1968] 1978), these were called bibliographic families. In Functional Requirements for Bibliographic Records (FRBR, IFLA 1998) they are called superworks. These are works like Gone with the Wind that have achieved iconic status, and thus for which potentially thousands of iterations have come forth, all of which can be associated with a common progenitor through shared ideational and semantic content (Smiraglia 2007a). Nonetheless, not everything in the superwork set Gone with the Wind is equivalent with Margaret Mitchell’s novel. Instead, ideational nodes within the set (such as a screenplay) are related works that have their own instantiation sets. The problem for information retrieval, as stated earlier, is to simultaneously collocate and disambiguate these large sets of instantiations.

5.0 The major conceptual schema currently in use for representing works

but

Beowulf

or

Aristotle
Meteorologica

a well-established English title is used for works originating before 1501. A set of terms (e.g., Selections, or Works, or Plays) are allowed for collocating collections under a single author. Parts of a specific work published separately are entered first under the original work and then qualified with the name of the part:

Tolkien, J.R.R.
Two towers

for part two of Tolkien’s trilogy Lord of the Rings, and “Selections” may be used also to designate a set of extracts from a work:

5.1 AACR2 Gibbon, Edward History of the decline and fall of the Roman Empire. All of this means that bibliographic works are very com- Selections. plex entities to handle in systems for information retrieval. The simple style of cataloging described earlier is insuffi- A translation is entered under the heading for the original cient to disambiguate the large collocated networks of in- work and qualified with the language of translation thus: stantiations associated with many bibliographic works. The Anglo-American Cataloguing Rules, Second Edition Caesar, Julius (AACR2 1998) contains within its complex rules for De bello Gallico. French & Latin. “Headings, Uniform Titles, and References” a set of re- quirements for attribution, denomination, collocation, and The net effect is an alphabetico-classed arrangement of disambiguation of the instantiations of works. Initially, works under their headings. For example, we might find a works are divided into those that are associated with a spe- set like the following in the catalog of a single library: cific creator and those that are not, such as: Dickens, Charles. Works Hemingway, Ernest Dickens, Charles. Selections Sun Also Rises Dickens, Charles. Bleak house and Dickens, Charles. Bleak house. French Episcopal Church Dickens, Charles. Great expectations Book of Common Prayer Dickens, Charles. Great expectations. Selections. Ger- but man Cloud of Unknowing Dickens, Charles. Pickwick papers where works are entered in creator-title citation form, but and so on. This effect is perhaps most pronounced under title alone for works not associated with a particular crea- headings for legal works and the Bible: tor. Denomination of works is dependent on their period of origin, with works promulgated primarily after the in- Bible. English. New Revised Standard. vention of printing from movable type (actually, the year Bible. N.T. Timothy 1500 is stipulated) entered under the version of the original Bible. O.T. 
Pentateuch title by which they have become known: and so forth. The effect of this arrangement is to accom- Dickens, Charles plish collocation under specific headings and sub-head- Pickwick papers ings, but it leaves disambiguation to chance or to the ex- 314 Knowl. Org. 46(2019)No.4 R. Smiraglia. Work pertise of a user. In a small library the user will likely not stantiations that we saw above in Table 1, and the column have difficulty making a selection from such a file, but in a on the right identifies things like book reviews, that are not bibliographic utility one can retrieve a search result of hun- part of the instantiation network of a work, but rather are dreds of headings for Pickwick Papers with no identifiable works about the work. distinction among them readily apparent. These rules sit at the apex of Anglo-American catalog- 5.2 The FRBR conceptual model and RDA ing tradition stretching from the late eighteenth century to the late twentieth. This tradition relies on an authorship FRBR (IFLA 1998) sets out an entity-relationship model principle that has been shown to occasionally override cul- for the bibliographic record that separates the inventory tural discourse in favor of assigning any work to a personal functions of the catalog that are item-based from the name, no matter how distant the named person might be searching functions that are work- or subject-based. A sim- from the creative task (Smiraglia and Lee 2012). It was only ple schema represents the entities as works, expressions, in the late nineteenth century that sufficient commercial manifestations, and items. In this schema, works remain interest arose in the profitable marketing of authorship, abstract, and items represent physical entities. 
Expressions which took the form of increased production of works as- and manifestations are the entity names for all forms of sociated with specific names (Smiraglia, Lee, and Olson instantiation, wherein expressions identify specific realiza- 2010). Discourse analysis of the authorship principle re- tions of works, particularly with regard to semantic con- vealed multiple meanings for “author” (Martínez-Ávila et tent, and manifestations identify physical embodiments of al. 2015, 1110): expressions. J.S. Bach’s Art of the Fugue is a work, the score of it or a performance of it are different expressions, and In cataloging tradition, and to some extent in classi- Breitkopf & Härtel and Schirmer editions of the score, or cal bibliography, an author is foremost a named en- Deutsche Grammophon and Nonesuch recordings of a tity to whom intellectual creativity is attributed. But performance are manifestations. The manifestations then also, and almost more importantly, in cataloging and reside in specific, physical items. This set of distinctions bibliographical tradition, as the discourse has been allows the separate inventory of instantiations according to transformed to this date, an author is the name of a their intellectual attributes. Change in ideational content class of related works that can be collocated with the results in a new work, change in semantic content results iconic representation of the named entity … The dis- in a new expression, and the role of the publisher who cussion leads inexorably to the conclusion that an brings an expression to market is recognized in the pro- author is not so much a person who writes, as it is duction of manifestations. Because much of the complex- the name of a class of works that can be related, ei- ity of current online catalogs results from the admixture of ther through power structures or lived experience, entity data in bibliographic records, FRBR promised a with a specific named entity. 
more articulate, if still complex, approach to access to works and their iterations. The work, then, has been used as the core historical anchor Problems with the FRBR conceptual model inhibited for an alphabetico-classed arrangement of instantiations in its full implementation although many approaches were the library catalog (Smiraglia 2003). This is all to change undertaken in the library and bibliographic utility commu- with the incorporation of the FRBR (IFLA 1998) concep- nity (see, for example, Smiraglia 2013). The most difficult tual model. problems were with the precise implementation of the ex- The following illustration (Figure 1) demonstrates the pression entity and with gaps in the model, principally for limits of librarianship’s ability to comprehend either the aggregate works (works that include other works, such as core ontological importance of works or the complexity anthologies, or journals) (Le Boeuf 2006; Smiraglia 2012). revealed by empirical research. Derived from Tillett (2004, Research into the nature and treatment of aggregates fed 4; 2001, 23) the figure arrays categories of work content into a 2015 report by an IFLA working group and is care- relationships in the form of a trajectory that embraces the fully reported in O’Neill, Žumer and Mixter (2015). Ag- point at which library cataloging rules distinguished be- gregates, which are arguably themselves “works,” were de- tween “versions” of a work and emergent “new” works.5 termined to be very common occurring more than 20% of Without reference to the research on mutation or that the time in one sample (128); the majority were antholo- on expression and manifestation entities, this diagram gies, conference proceedings, scholarly journals and com- shows some of the kinds of publications in library catalogs pilations (127). 
A multiplicity of aggregations of expres- that require collocation and disambiguation of instantia- sions of works—original essays, reprinted articles, transla- tions of works. The column to the left identifies copies, tions, etc.—commonly occur in aggregates. A conceptual the central column lists kinds of derivative “mutation” in- model consisting of three types of aggregation—collec- Knowl. Org. 46(2019)No.4 315 R. Smiraglia. Work

Figure 1. Tillett’s (2004, 8; 2001, 23) content relationships. tions, augmentation aggregates, and parallels—was re- works, for each of which there might be many ex- ported in O’Neill and Žumer (2012). pressions. Not all expressions spawn manifestations, Many of these problems have been tackled in a reinter- and so forth. Also a distinction is made between the pretation of the FRBR conceptual model from its original intellectual work, and publication events, which entity-relationship model into an object oriented model might spawn manifestations. (Bekiari et al. 2015), which is harmonized with the cultural heritage information sharing ontology known as the In 2010 a new set of international (but mostly Anglo- CIDOC -CRM. Known as FRBRoo, the empirically-based American in practice) cataloging instructions were intro- object-oriented model overcomes the earlier difficulties by duced under the rubric “Resource Description and Ac- removing temporal requirements for expressions and al- cess” (RDA). RDA embraces the FRBR entity-relationship lowing aggregation (Smiraglia 2015, 297): conceptual model and divorces problems in transcription from manifestations or inventory control of items from Entities are broken into associated phenomena those of representing works and their creators. There is named “objects,” and are oriented to each other by very little difference between RDA and AACR2 in the their attributes. The various models, then, do not outcome of work identifiers—an alphabetico-classed sys- rely on temporality or mutually exclusive classes, but tem of works, ordered by their preferred titles6 under au- rather on associative principles of linked attributes. thorized access points for their creators still is used—but FRBR’s “W-E-M-I” (works-expressions-manifesta- more flexibility is available for multiple attribution where tions-items) entities in FRBRoo become objects that creatorship is complex or even unattributed. 
Fuller imple- may be associated multiply according to their related mentation of the FRBR conceptual model through the use attributes. In a given instantiation network derived of RDA is leading to the more appropriate representation from an ideational conception there might be many of works in information retrieval systems, with both better 316 Knowl. Org. 46(2019)No.4 R. Smiraglia. Work clustering and better disambiguation. Problems still remain graphic and the other encompassing a musical sound re- in RDA with works that are performed and recorded (Smi- cording aggregate were demonstrated (Smiraglia 2015), raglia 2007c)—so that no distinction is made between the and an experiment using FRBRoo to do the same with in- work that is a recording of a performance of Eroica Sym- stantiated open government data also has proven fruitful phony and the work that is its performance—although (Park and Smiraglia 2017). FRBRoo provides a mechanism to do so. The most recent development incorporating works into 6.0 Conclusion a functional conceptual model is represented by the 2017 Library Reference Model (LRM), which offers a harmoniza- In this essay, a variety of points of view about the work tion of the entire family of FRBR conceptual models and and its nature have been surveyed, many of them stem- is itself harmonized with the CIDOC CRM. The LRM de- ming from diverse epistemological understandings. It is fines the “work” entity as “the intellectual or artistic con- now widely accepted in librarianship that a work is a delib- tent of a distinct creation” and its expression as “a distinct erately created informing entity intended for communica- combination of signs conveying intellectual or artistic con- tion, and that a work consists of abstract intellectual con- tent” (Žumer 2018, 312). tent that is distinct from any object that might be its car- rier. 
Works of art inhere in less abstract form in the objects 5.3 Incorporating works in classified arrays that result from the activity of technologically creating them; those that persist do so in specifics of time and Recognition of the classified nature of lists of works pro- place. Works are mentefacts—mental constructs—but as duced by cataloging rules has led to the interesting idea such they also are cultural artifacts reflecting social values. that FRBRoo-designated work entities might themselves Works also are ontological realities, which makes them ob- be used to classify instantiations by their incorporation as jects for knowledge organization, with properties of idea- auxiliaries into faceted classifications such as the Universal tion and communicative attributes (often referred to as se- Decimal Classification (UDC). Such treatment relies on mantic) that are used to positively identify and then bound understanding the role of works as taxonomic elements of them. Works and their re-presentations instantiate across canons made up of “the concatenation of mutable mutat- time and thereby lies the unity that links all competing def- ing instantiations” (Smiraglia and van den Heuvel 2013, initions. In knowledge organization, the importance of a 378): work is its role as nominal anchor. It matters not whether a work has a sequence of instantiations or exists as an ob- A canon is the literature accepted as foundational for ject with representations; what matters in knowledge or- a domain, and therefore, a canon can be as broad or ganization is the identity associated with a work, which it- narrow as its domain. It is canonicity, or acceptance self becomes an iconic conceptual entity in knowledge or- into a canon, that has been demonstrated to be asso- ganization systems. ciated with a high degree of instantiation. 
Put more If bibliographic reality conforms to Patrick Wilson’s vi- simply, a work or a set of works, once accepted into sion ([1968] 1978) of a bibliographic universe made up of a canon, become in demand, which causes more edi- a vast concept space in which related entities move vari- tions, translations, adaptations, commentaries, etc. ously in consort dependent on the intensity or vagueness to be generated by the domain. These canons pro- of their inter-relationships, works constitute the celestial vide the warrant for most classificatory activity in bodies that populate it. Works lie at the center of galaxies KO. Instantiation has been shown to be a contin- of instantiating points. However appealing we find such a uum along which ideation is combined with intellec- metaphor, the reality is that works are essential entities tual force into the expressions of works (Smiraglia both as cultural mentefacts and as targets for information [2008]). Motion is the pathway of ideation in the pro- retrieval. Although this reality has been recognized for a cess of instantiation. long time, it is only of late that we have gathered sufficient empirical evidence of the works-phenomenon to allow the A mechanism for using the works entity to link elements more powerful relational structure that will underlie future of traditional conceptual classification strings either to ex- information retrieval systems. ternal ordering systems (such as document retrieval sys- tems) or to W-E-M-I-specific identifiers with as auxiliaries Notes has been demonstrated to be consistent with concepts of multi-versal or multi-dimensional knowledge organization 1. Day (2008) references several works by Heidegger, but (Smiraglia, van den Heuvel and Dousa 2011). Two cases the most critical to his point seems to be Heidegger using FRBRoo to delineate instantiated works, one biblio- ([1964] 1977). VRA Core is a set of online data stand- Knowl. Org. 46(2019)No.4 317 R. Smiraglia. Work

ards and schema for the “description of images and Congress magazine Technicalities (v. 25, no. 5 2003). The works of art and culture” maintained by the Visual Re- illustration originated in Tillett’s 2001 chapter in Bean sources Association and the Library of Congress and Green’s Relationships in the Order of Knowledge. (https://www.loc.gov/standards/vracore/). The pamphlet and chapter are cited here. 2. Copyright legislation as a source of cultural warrant for 6. RDA Toolkit (http://access.rdatoolkit.org/) 5.5 “When works is discussed in Smiraglia (2001, 68-72), specifi- constructing an authorized access point to represent a cally with reference to copyright protection, which sub- work or expression, use a preferred title for work (see sists in “works of authorship” that may be “literary, mu- 6.2.2) as the basis for the access point” and 6.2.2.4 “For sical, dramatic, pantomime or choreographic, pictorial, works created after 1500, choose as a preferred title for graphic or sculptural, motion picture or audiovisual, work the title or form of title in the original language by sound recordings, and architectural” (71). An anony- which the work is commonly identified either through mous referee made reference to Warner (1993), which use in manifestations embodying the work or in refer- relies on similar material to establish a historical con- ence sources.” nection between the development of computing from pre-existing information technologies. References 3. An anonymous reviewer asks “are all kinds of mentefacts works?” The answer is no, because a work is, as defined Anglo-American Cataloguing Rules, Second Edition. (AACR2), in the first paragraph, “a deliberately created informing ed. Michael Gorman and Paul W. Winkler. 1988 revi- entity intended for communication.” A mentefact is not sion. Chicago: American Library Assn. by nature a work, but only becomes one if it is created in Barthes, Roland. 1975. The Pleasure of the Text, trans. 
Rich- a form intended for communication. ard Miller with a note on the text by Richard Howard. 4. An anonymous referee suggests “symbolic” is a better New York: Noonday Press. term than semantic, to distinguish the signified aspects Bekiari, Chryssoula, Martin Doerr, Patrick Le Boeuf and of a work. The choice of terms is not so simple. The Pat Riva, eds. 2015. FRBR Object-Oriented Definition and

term “semantic content” as is explained in Smiraglia Mapping from FRBRER, FRAD and FRSAD. Version 2.1. (2001, 31ff.) is derived from research by Carpenter Heraklion: International Working Group on FRBR and (1981, 118-20) who relied on work by Wilson ([1968] CIDOC CRM Harmonisation. http://www.cidoc-crm. 1978) and Domanovsky (1974). Thus, the term is the org/frbr_drafts.html result of the inheritance of empirical thought into the Buckland, Michael. 2018. “Document Theory.” Knowledge nature of a work in information science. That musical Organization 45: 425-36. doi:10.5771/0943-7444-2018-5- notation, for example, is not “semantic” in the same 425 was as verbal text is obvious but also disregards the pur- Carlyle, Allyson. 1999. “User Categorisation of Works: To- pose of musical notation. Musical notation might be ward Improved Organisation of Online Catalogue Dis- symbolic to some, but to a musician it is entirely seman- plays.” Journal of Documentation 55:184-208. tic. Thus, the term “symbolic,” which is admittedly en- Carpenter, Michael. 1981. Corporate Authorship: Its Role in ticing, is incorrect. Works are mentefacts, embodied by Library Cataloging. Westport, Conn.: Greenwood Press. ideational content and communicated by semantic con- Coleman, Anita S. 2002. “Scientific Models as Works.” tent. That syntactic content might also be useful is dis- Cataloging & Classification Quarterly 33, nos. 3-4: 129-59. cussed in Smiraglia and van den Heuvel (2013). Day, Ronald E. 2008. “Works and Representation.” Journal 4. An anonymous referee asks “Is Einstein’s theory of rel- of the American Society for Information Science and Technology ativity a work” and suggests that “work seems not to be 59: 1644-52. [an] important concept in (natural) scientific communi- Domanovsky, Ákos. 1974. 
Functions and Objects of Author ties.” But, there are citations throughout this article to and Title Cataloguing: A Contribution to Cataloguing Theory, Greenberg’s work on life-cycle modeling from bota- English text ed. by Anthony Thompson. Budapest: nists, and Coleman’s groundbreaking work on scientific Akademiai Kiado. models as works has long needed replication. In fact, Foucault, Michel. 1984. “What is an Author?” In Foucault what the referee observes is that the hard sciences are Reader, ed. Paul Rabinow. Harmondsworth: Penguin, less populated by instantiating monographs and more 101-20. populated by un-tracked instantiating models. Is Ein- Furner, Jonathan. 2009. “Interrogating ‘Identity’: A Philo- stein’s theory a work? No, but the text in which he in- sophical Approach to an Enduring Issue in Knowledge troduced it is. Organization.” Knowledge Organization 36:3-16. 5. The color illustration is from a Library of Congress pamphlet, which itself was reprinted from a Library of 318 Knowl. Org. 46(2019)No.4 R. Smiraglia. Work

Furner, Jonathan. 2010. “Philosophy and Information Stud- brary Association. In RDA Toolkit: http://www.rda ies.” Annual Review of Information Science and Technology 44. toolkit.org Medford, N.J: Wiley, 159-200. Smiraglia, Richard P. 1992. “Authority Control and the Ex- Gnoli, Claudio. 2018. “Mentefacts as a Missing Level in tent of Derivative Bibliographic Relationships.” Ph.D. Theory of Information Science.” Journal of Documentation diss., University of Chicago. 74: 1226-42. doi:10.1108/JD-04-2018-0054 Smiraglia, Richard P. 1996. “The Light in the Piazza: Goehr, Lydia. 1992. The Imaginary Museum of Musical Works: Glimpses of the Bibliographic Universe.” In Ventures An Essay in the Philosophy of Music. Oxford: Clarendon. in Research Series 17, ed. Sheila McKenna. Brookville, Greenberg, Jane. 2009. “Theoretical Considerations of Life- NY: Faculty of the C.W. Post Center, Long Island Uni- cycle Modeling: An Analysis of the Dryad Repository versity, 99-120. Demonstrating Automatic Metadata Propagation, Inher- Smiraglia, Richard P. 2001. The Nature of “A Work:” Impli- itance, and Value System Adoption.” Cataloging & Classi- cations for the Organization of Knowledge. Lanham, MD: fication Quarterly 47, no. 4: 380-402. Scarecrow Press. Heidegger, Martin. (1964) 1977. “The End of Philosophy Smiraglia, Richard P., ed. 2002a. Works as Entities for Infor- and the Task of Thinking.” In Basic Writings: From Being mation Retrieval. New York: Haworth Press. and Time (1927) to The Task of Thinking (1964), ed. David Smiraglia, Richard P. 2002b. “Works as Signs, Symbols, Farell Krell. New York: Harper & Row, 370-92. and Canons: The Epistemology of the Work.” Knowledge IFLA (International Federation of Library Associations and Organization 28: 192-202. Institutions, Study Group on the Functional Require- Smiraglia, Richard P. 2003. “The History of ‘The Work’ in ments for Bibliographic Records). 1998. 
Functional Re- the Modern Catalog.” Cataloging & Classification Quarterly quirements for Bibliographic Records. München: K.G. Saur. 35 no. 2/4: 553-67. Leazer, Gregory H. and Furner, Jonathan. 1999. “Topo- Smiraglia, Richard P. 2006. “Empiricism as the Basis for logical Indices of Textual Identity Networks.” In Pro- Metadata Categorization: Expanding the Case for In- ceedings of the 62nd Annual Meeting of the American Society for stantiation with Archival Documents.” In Knowledge Or- Information Science, October 31, 1999, ed. L. Woods. Med- ganization and the Global Learning Society, Proceedings of the ford, NJ: Information Today, 345-58. 9th International ISKO Conference, Vienna, July 4-7, 2006, LeBoeuf, Patrick, ed. 2006. FRBR: Hype or Cure-all? Bing- ed. Gerhard Budin, Christian Swertz and Konstantin hamton, NY: Haworth Information Press. Mitgutsch. Advances in Knowledge Organization 10. Martinez-Ávila, Daniel, Richard P. Smiraglia, Hur-Li Lee Würzburg: Ergon Verlag, 383-88. and Melodie Fox. 2015. “What Is an Author Now? Dis- Smiraglia, Richard P. 2007a. “Bibliographic Families and course Analysis Applied to the Idea of an Author.” Jour- Superworks.” In Understanding FRBR: What it is and How nal of Documentation 71: 1094-114. it Will Affect our Retrieval Tools, ed. Arlene G. Taylor. Li- O’Neill, Edward and Maja Žumer. 2012. “Modeling Aggre- braries Unlimited, 73-86. gates in FRBR.” Cataloging & Classification Quarterly 50, Smiraglia, Richard P. 2007b. “The ‘Works’ Phenomenon nos. 5-7: 456-72. doi:10.1080/01639374.2012.679547 and Best Selling Books.” Cataloging & Classification Quar- O’Neill, Edward, Maja Žumer and Jeffrey Mixter. 2015. terly 44, no. 3/4: 179-95. “FRBR Aggregates: Their Types and Frequency in Li- Smiraglia, Richard P. 2007c. “Performance Works: Contin- brary Collections.” Library Resources & Technical Services uing to Comprehend Instantiation.” In Proceedings of the 59: 120-9. North American Symposium on Knowledge Organization. 
June Panizzi, Antonio. (1848) 1985. “Mr. Panizzi to the Right 14‐15 Toronto, Canada,” ed. Joseph T. Tennis. http:// Hon. the Earl of Ellesmere.” In Foundations of Cataloging, dlist.sir.arizona.edu/view/conference/North_American ed. Michael Carpenter and Elaine Svenonius. Littleton, _Symposium_on_Knowledge_Organization_2007.html Colo.: Libraries Unlimited, 15-47. Smiraglia, Richard P. 2007d. “When is a Terracotta Hut Urn Park, Hyoungjoo and Richard P. Smiraglia. 2017. “Ontolog- like a Sailor’s Deck-log? Meaning Instantiated Across ical Data-Sharing of Open Government Data for Data Virtual Boundaries.” Paper delivered at Museums & the Curation.” Hyoungjoo Park and Richard P. Smiraglia. Web 2007, April 11-14, San Francisco, California. Canadian Journal for Information and Library Science 41: 285- https://www.museumsandtheweb.com/mw2007/pa 307. pers/smiraglia/smiraglia.html Petek, Marija. 2007. “Derivative Bibliographic Relation- Smiraglia, Richard P. 2008. “A Meta‐analysis of Instantia- ships in the Slovenian Online Catalogue COBIB.” Jour- tion as a Phenomenon of Information Objects.” Culture nal of Documentation 63: 398-423. del testo e del documento 9 no. 25: 5-25. RDA (Resource Description and Access). 2010- Chicago: American Library Association; Ottawa: Canadian Li- Knowl. Org. 46(2019)No.4 319 R. Smiraglia. Work

Smiraglia, Richard P. 2012. “Be Careful What You Wish Svenonius, Elaine. 2001. The Intellectual Foundation of Infor- for: FRBR, Some Lacunae, a Review.” Cataloging & mation Organization. Cambridge: MIT Press. Classification Quarterly 50 no. 5/7: 360-68. Talbot, Michael. 2000. The Musical Work: Reality or Invention? Smiraglia, Richard P., ed. 2013. The FRBR Family of Concep- Liverpool Music Symposium 1. Liverpool: Liverpool tual Models, ed. Richard P. Smiraglia; co-ed. Pat Riva and University Press. Maja Zumer. London: Taylor and Francis. Tanselle, G. Thomas. 1989. A Rationale of Textual Criticism. Smiraglia, Richard P. 2015. “Bibliocentrism Revisited: Philadelphia: University of Pennsylvania Press. RDA and FRBRoo.” Knowledge Organization 42: 296-301. Tillett, Barbara B. 2001. “Bibliographic Relationships.” In Smiraglia, Richard P. 2017. “Data and Metadata Instantia- Relationships in the Organization of Knowledge, ed. Carol A. tion: Use Cases and a Conceptual Model.” Paper read at Bean and Rebecca Green. Dordrecht: Kluwer, 19-36. DC-2017, DCMI International Conference on Dublin Tillett, Barbara B. 2004. “What is FRBR?: A Conceptual Core and Metadata Applications, October 26-29, Wash- Model for the Bibliographic Universe.” Washington, ington, D.C. http://dcevents.dublincore.org/IntConf/ DC: Library of Congress, Cataloging Distribution Ser- dc-2017/paper/view/516 vice. https://www.loc.gov/cds/downloads/FRBR.PDF Smiraglia, Richard P. and Charles van den Heuvel. 2013. Vellucci, Sherry L. 1997. Bibliographic Relationships in Music “Classifications and Concepts: Elementary Theory of Catalogs. Lanham, MD: Scarecrow Press. Knowledge Organization.” Journal of Documentation 69: Verona, Eva. 1985. “Literary Unit Versus Bibliographical 360‐83. Unit.” In Foundations of Cataloging, ed. Michael Carpenter Smiraglia, Richard P. and Hur-li Lee. 2012. “Rethinking the and Elaine Svenonius. Littleton, CO: Libraries Unlim- Authorship Principle.” Library Trends 61, no. 1: 35-48. 
ited, 155-75. Smiraglia, Richard P., Charles van den Heuvel and Thomas Warner, Julian. 1993. “Writing and Literary Work in Cop- Dousa. 2011. “Interactions Between Elementary Struc- yright: A Binational and Historical Analysis.” Journal of tures in Universes of Knowledge.” In Classification & the American Society for Information Science 44: 307-21. Ontology: Formal Approaches and Access to Knowledge: Pro- Wilson, Patrick. (1968) 1978. Two Kinds of Power: An Essay ceedings of the International UDC Seminar 19-20 September in Bibliographical Control. California Library Reprint Se- 2011, The Hague, Netherlands, ed. Aïda Slavic and Ed- ries. Berkeley: University of California Press. gardo Civallero. Würzburg: Ergon Verlag, 25-40. Yee, Martha M. 1993. “Moving Image Works and Manifes- Smiraglia, Richard P., Hur-li Lee and Hope A. Olson. tations.” PhD diss. University of California, Los Angeles. 2010. “The Flimsy Fabric of Authorship.” In Information Žumer, Maja. 2018. “IFLA Library Reference Model Science: Synergy through Diversity, Proceedings of the 38th An- IIFLA LRM)-Harmonisation of the FRBR Family.” nual CAIS/ACSI Conference, Concordia University, Mon- Knowledge Organization 45: 310-18. doi:10.5771/0943- treal, Quebec. June 2 - 4 2010, ed. Elaine Ménard and Va- 7444-2018-4-310 lerie Nesset. Strout, Ruth French. 1956. “The Development of the Cata- log and Cataloging Codes.” Library Quarterly 26: 254-75.

320 Knowl. Org. 46(2019)No.4 J. Saarti. Fictional Literature: Classification and Indexing

Fictional Literature, Classification and Indexing†

Jarmo Saarti

University of Eastern Finland, Kuopio Campus, Library, P.O.Box 1627 Kuopio 70211, Finland

Jarmo Saarti is Library Director, University of Eastern Finland and an adjunct professor (docent) at University of Oulu and University of Tampere. He has been chair of the board of the National Repository Library in Finland and member of the board of the National Library of Finland; member of IFLA’s standing committee of the Academic Libraries and Other Research Libraries Section and is at present member of the Standing Committee of the Section on Document Delivery and Resource Sharing; chair of the Finnish Research Library Association 2014-2017 and board member for the Federation of the Finnish Learned Societies.

Saarti, Jarmo. 2019. “Fictional Literature: Classification and Indexing.” Knowledge Organization 46(4): 320-332. 75 references. DOI:10.5771/0943-7444-2019-4-320.

Abstract: Fiction content analysis and retrieval are interesting specific topics for two major reasons: 1) the extensive use of fictional works; and, 2) the multimodality and interpretational nature of fiction. The primary challenge in the analysis of fictional content is that there is no single meaning to be analysed; the analysis is an ongoing process involving an interaction between the text produced by the author, the reader and the society in which the interaction occurs. Furthermore, different audiences have specific needs to be taken into consideration. This article explores the topic of fiction knowledge organization, including both classification and indexing. It provides a broad and analytical overview of the literature and describes several experimental approaches and developmental projects for the analysis of fictional content. Traditional fiction indexing has been mainly based on the factual aspects of the work; this has then been expanded to handle different aspects of the fictional work. Attempts have been made to develop vocabularies for fiction indexing. All the major classification schemes use the genre and language/culture of fictional works when subdividing fictional works into subclasses. The evolution of shelf classification of fiction and the appearance of different types of digital tools have revolutionized the classification of fiction, making it possible to integrate both indexing and classification of fictional works.

Received: 26 March 2019; Revised: 3 May 2019; Accepted: 9 May 2019

Keywords: fiction, indexing, classification, fictional, subject, works

† Derived from the article of similar title in the ISKO Encyclopedia of Knowledge Organization, Version 1.0, published 2019-03-26. Article category: KO in specific domains. The author would like to thank Birger Hjørland, who served as the initiator and editor for this article, the two anonymous reviewers who provided valuable feedback that increased the value of this article, Mark Ward for bringing me back to the subject, and especially Ewen Macdonald for linguistic advice.

1.0 Introduction

There are several reasons why fiction content analysis and retrieval are interesting topics within the knowledge management and organization of documents, i.e., the practical need for fiction retrieval has remained unabated while the possibilities for creating retrieval systems for fiction have increased. This can be traced to the development of computerised environments for information retrieval and especially for the dissemination of fictional works by both commercial internet-based vendors and the public sector. These developments have applied a multifaceted approach of analysing and describing texts, as this is an important feature of characterizing and finding the appropriate works of fiction. One must remember that fiction is the most popular type of literature, especially in public libraries.

The history of active content analysis of fiction is surprisingly short, only about one hundred years. The need for the fiction indexing and classification was, and still sometimes seems to be a political issue. As Eriksson stated (2010, vii):

    An early significant event is an extensive classification of fiction carried out by the Free Library of Philadelphia in the very beginning of the 20th century. This work becomes a national issue in the USA when the classification is discussed for a few years at the ALA’s annual congress, but it ends up being dismissed. The thesis [i.e., Eriksson’s work] argues that this decision stopped the development of classification for fiction for decades, and quite possibly it is one of the reasons why bibliographic systems, even in the 1980s, did not reflect the topics or themes of fiction. Only eighty years later did the ALA change its mind and from 1990, fiction has been indexed in USA and Denmark, and this may be anticipated to spread to many other countries.

The inexorable spread of the internet, especially from the beginning of this millennium, served as an impetus for the organization of fictional knowledge, e.g., the development of specialized and fast information retrieval systems. First, the different vendors, e.g., commercial bookstores, publishers and even individual readers, started to utilize classification and indexing as well as other tools in their internet services. The evolving statistical and social media types of tools were also incorporated into both commercial and library information systems. Furthermore, the internet created totally new tools for promoting fiction and supporting a reading culture (Collins 2010; Ross, McKechnie and Rothbauer 2018; Birdi and Ford 2017).

There is already some evidence that enriched result lists and multiple entry points to fiction may help users to locate books (Mikkonen and Vakkari 2016, 67), whereas a simple access point is not as useful (Wilson et al. 2000). In addition, the search strategies used by readers to locate fiction have been analysed and found to support the multimodal nature of fiction searching as well as to take into account the needs of each individual reader trying to find fiction (Saarinen and Vakkari 2013, 752-3).

The gradual shift to the digital distribution of information has meant that one needs new tools for analysing the contents of fictional material as well as for its indexing. In other words, texts and other materials that have not been analysed, described and classified and/or indexed in full text databases are hard or even impossible to retrieve. Another reason why we need to take a fresh approach to the content analysis of fictional material is that a free text search is not efficient when searching fictional material. This becomes apparent if we compare it with the search and retrieval of publications in the natural sciences, where even though the text and content may be very topical, its retrieval is usually rather straightforward.

From the viewpoint of information science, the analysis of fictional texts and the information dissemination process of fictional works clearly challenge but also enrich the traditional theoretical models and thus expand the theoretical tools and concepts underpinning this field of research (see, e.g., Beghtol 1994b and 1997; Green 1997; Ward and Saarti 2018).

This article evaluates the methods and tools for organizing fictional knowledge with a special emphasis on the content representation of fiction, mainly from the perspective of public libraries.

2.0 Information process of fiction

The main actors in the information process of fiction are the work of art, its creator, i.e., the writer, the reader and the social-historical environment where the publishing and reception take place (Beghtol 1986, 93; Saarti 2000a, see Figure 1). Because of the special nature of a work of fiction, the reception of the work of art is not fulfilled unless all the above actors participate in the process. The role of the writer is to write works of art (novels, short stories, poems, plays) to be published. The role of the work of art is to be a medium through which the artist can communicate with his/her audience. However, the work of art has its own, autonomous life; after the book has been published, the writer can only have a role as one of its readers, i.e., an interpreter of the work.

Figure 1. Communication process of fiction. (Adapted from Segers 1985, 72 and Martens 1975, 36; Saarti 2000a)

The role of the reader is that he or she is an interpreter of a work of art. The interpretation as well as the creation of a work of art takes place in a social-historical context that defines the language used and its means of artistic expression. Without a common language, there can be no communication between the writer and her or his readers. This influences the search for fiction; the knowledge about authors, works and their likeness to other works of art are major factors when searching for fiction and the systems should support this fact (Ross 2001).

It is also typical for fictional communication that it is a two-way street. One can first consider it in terms of factual meanings, e.g., references to actual happenings, historical events and geographical facts etc. (see, e.g., Ranta 1991, 20-23). On the other hand, it has an aesthetic facet, but this will be based on the individual interpretation and reception. That influences the content description; on the one hand, objective grounds can be identified, but on the other hand, some aspects are subjective and thus personal and diverse. This dichotomy was apparent in Saarti’s study, where test persons indexed and abstracted novels. The indexing was found to be very inconsistent (Saarti 2002), and one could characterize the abstracts in the following categories (Saarti 2000):

– Abstracts that describe the structure and content of the novel (plot/thematical abstract).
– Abstracts that describe the position of the novel in its writer’s list of works or describe the novel’s position in the literary canon (cultural/historical abstract).
– Abstracts that describe the reading experience.
– Critical abstracts.

Adkins and Bossaller (2007) conducted an analysis of the access points to fiction in computer-mediated book information sources. They stated (354) that:

    Online bookstores may be effective tools for librarians helping patrons find ‘good’ books because of their increased use of access points. However, reader advisory databases, which contain reviews and subject headings, are occasionally more effective than online bookstores for identifying books published prior to the 1990s.

They list (368) altogether thirty-five different types of access points that they found in databases to fictional works including contents, cataloguing information, visual information, plot information, reviews etc.

Vernitski (2007) has proposed a model for managing the intertextuality of fictional works. She postulated (47-48) that there are the following nodes for the intertextual references: quotation, allusion, variation, sequel and prequel. She stated that these types of indexes could be especially useful for the research community. Thus, the organization of fictional knowledge is also dependent on the point-of-view of the target audience: fiction can be read and interpreted in completely different ways and these need different types of tools and approaches for their management.

Thus, it is evident that the primary challenge for the fiction content analysis is that there is no single topical meaning to be analysed; in fact, the analysis is an ongoing project due to the nature of the fictional process, i.e., there is a continual interaction between the author, text, society and reader. Furthermore, different audiences have their own specific needs that must be taken into consideration.

3.0 Aspects of fiction content description

Ranta (1991) has drawn a distinction between two basic kinds of elements to be indexed in fictional works: denotative and connotative. Denotative or factual elements consist of facts in fictional works, such as the setting, personae and factual elements of the plot. Connotative or imaginative elements consist of elements interpreted from fictional works, e.g., the theme and its interpretation and issues arising from the expressional aspects of the work of art (Ranta 1991, 20-23). Ranta has utilized Shatford’s approach for indexing photographs, based on Panofsky’s theory. Shatford divided the meaning into two categories, i.e., factual and expressional forms. The difference between these two categories is that the factual meanings are objective while the expressional meanings are subjective. “The former describes what the picture is Of, the latter, what it is About.” Thus, the indexing of the factual meanings is far more straightforward than that of the expressional meanings (Shatford 1986, 42-50, emphasis original).

It has also been typical that traditional classifications of fiction have a very theoretical foundation, especially the traditional denotative classification systems. They are mainly built on the tradition of historical linguistics originating from the romantic era and ideologies with an educational basis. Unfortunately, in these approaches, the needs of the users are ignored. This was one of the reasons why Pejtersen carried out her study in Danish public libraries to determine what the users wanted to be classified/indexed from the novels. As a result, she divided the questions of the interviewed users into four categories: subject matter, frame, author’s intention and accessibility (Pejtersen and Austin 1983, 234).

Pejtersen’s categories can be divided into denotative (subject matter and frame) and connotative (author’s intention) aspects. Furthermore, she has included aspects that are usually left to the cataloguing of books in terms of group accessibility (e.g., physical characteristics). This shows that a system for fiction, created according to the reader’s wishes, must be multi-faceted and include both denotative and connotative aspects; some that are easily recognizable and traditional, as well as some that are unfamiliar to the present systems of classifying and indexing (e.g., evaluating). Pejtersen’s results also indicate that the clear division between cataloguing and classifying/indexing is of no relevance to users: their only interest is in locating the works of art they need as easily as possible. Thus, Green stated that the indexing terms of fiction should be divided into two categories: subject terms and attribute terms. The former are those “that reflect what a document or a user need is about.” However: “This leaves attribute indexing to reflect such other characteristics of documents and user needs as language, regency, author affiliation, intended audience, and so on” (Green 1997, 86).

The most problematic aspect in Pejtersen’s scheme is the author’s intention, because this is based on the indexer’s point of view, i.e., on his/her interpretation. This is especially true in the case of emotional experience that does not belong to the work itself but to the reader. Categorizing the author’s intention is also problematic, because it is difficult, if not impossible, to define from the work of art what the author’s intention was. In addition, as Wellek and Warren already mentioned, the author can misinterpret his or her own intention: “It happens to all of us that we misinterpret or do not fully understand what we have written some time ago” (Wellek and Warren 1980, 148). Furthermore, in order to define the author’s intention, we would have to ask the author him/herself, which would be very difficult, time-consuming and in many cases completely impossible.

Andersson and Holst modified Pejtersen’s classification in their study, which was based on interviews of 100 users in two Swedish public libraries; they then analysed the descriptions of the novels’ plots and compared them with the library’s indexes (Andersson and Holst 1996, 88). Their model included the following categories: phenomena, the frame and the author’s intention.

Andersson and Holst have added some important aspects to Pejtersen’s categories that belong to fictional communication, e.g., a borrowed motif, a subtler analysis of the phenomena of fictional works and a category related to modifications as well as additions to the author’s intention, in which they have used a more neutral concept of message complemented with the reader’s experience.

It is interesting to note that the above categories do not include fundamental aspects of the work of art: the aesthetic and/or moral value of the work. Of course, one reason is that valuing is usually very subjective and thus fits poorly with the traditional neutral approach of indexing and classifying works. On the other hand, when the valuing of a work of art is omitted, one aspect of an aesthetic object, and perhaps the most important one, is ignored. It also seems that users do want valuing of works of art. This can be observed in many forms, e.g., in marketing, criticism or knowledge that the book has been a candidate for a prestigious award and prize in literature etc.

It can also be seen that the aspects to be indexed or classified are mostly limited to those that are as objective (denotative) as possible. Pejtersen as well as Andersson and Holst have added a few mutable/fuzzy categories that are based upon readers’ experiences. Nonetheless, there are some aspects totally missing from the categories mentioned above, i.e., the history of different interpretations of a work of art as well as its position in the literary-historical continuity. In some cases, this aspect could be interesting and enlightening. In this respect, the author and his/her role have secondary roles in the above categories. On the other hand, this reveals that we must make clear definitions about what aspects are worth indexing in fictional works. In addition, it indicates that the systems for indexing fiction are clearly dependent on the environment for which they have been created.

We can see from the schemes described above that the traditional type of fiction indexing is mainly based on factual aspects. According to Nielsen (1997), these should be extended to incorporate aspects of thematical factors, as well as the features of the narrational structures. This is needed because in modern and post-modern fiction, the main point is how it is told and not what is told. The third aspect that Nielsen emphasises as one way of improving fiction indexing would be the inclusion of both cultural and historical facts that have affected the work, e.g., artistic schools and cultural periods (see also Negrini and Adamo 1996 where there is a more precise analysis of the literature domain).

For the classification of fiction, the different literature genres have often been used as a basis for the classification (see further Rafferty 2012). In this respect, a genre means literally a kind or a class. However, as Chandler (1997, 1) stated, the concept of genre is problematic in several ways. The concept of genre is often used in a biological way, i.e., in biology a genre can be thought of as a genealogically defined species, whereas in literature, genres are continually being re-defined.

There also seem to be different layers in genre definition. In fiction, the broadest genres are poetry, prose and drama and their consequent subdivisions. This classical definition can be seen in the traditional classification schemes. When using specific genres as a basis for classification, one has to bear in mind that: “The classification and hierarchical taxonomy is not a neutral and ‘objective’ procedure. There are no undisputed ‘maps’ of the system of genres within any medium (though literature may perhaps lay some claim to a loose consensus). Furthermore, there is often a considerable theoretical disagreement about the definition of specific genres” (Chandler 1997, 1). For an example of the complexity of fiction genres, see Appendix 1.

4.0 Classification and indexing of fiction

Because of the nature of fiction, it has proved very difficult to separate the indexing from the classification of fiction: there are several significant facets to be considered in the indexing, and classification schemes thus become multi-faceted. In fact, some classification schemes use keywords as class notations.

One major feature of fiction indexing and classification studies has been the problem of identifying those aspects that are worth indexing and/or classifying in individual works. Traditionally, the general classification systems have utilized a literary basis (specifically genre), the year of publication (sometimes with the reference to an epoch) and the country of publication and/or the writer (sometimes with a reference to cultural regions). Some classification schemes have later expanded to include certain specific classes of subject matter. These have remained the basic foundations of the main classification systems (see, e.g., Beghtol 1989 and 1990). The literary genre, time of publication and geographical region are useful bases for classification. They can be considered to belong to the tradition of historical linguistics used for classifying languages and their literature. They can also be viewed as providing an objective basis for the classification. However, these classification systems leave the idea of describing the subject content of fiction, i.e., what the fiction is about, untouched (see also Bierbaum 1995, 390).

The studies on the classification of fiction can be divided into two categories: those that discuss the shelf classification of fiction and those that believe that the classification should be a means to provide a content description of fiction.

Fiction classification studies have constantly emphasised the fact that the content description of fiction will necessarily be multi-faceted. Thus, Beghtol claimed in her study examining the different fiction classification schemes: “Characters, Events, Spaces and Times may be taken as fundamental data categories for fiction” (Beghtol 1994a, 157). Pejtersen (Pejtersen and Austin 1983 and 1984) made the same kind of claim in an empirical study on the basic aspects that patrons use while searching fiction for themselves. Pejtersen’s studies also imply that indexing and classification, especially with respect to fiction, are merging into more holistic schemes where classes are described by indexing terms and vice versa. User-friendly systems such as Pejtersen’s BookHouse (Pejtersen 1989) have adopted this type of classification with indexing terms as class notations.

Previous studies on fiction indexing can be divided into two categories; the first consists of those that discuss fiction indexing and the principles behind it at a general level. The second category includes those that deal with the creation of book indexes. The studies on book indexes have been mostly carried out in Anglo-American cultures, which have a long tradition of book indexing, but some work has been done in the Nordic countries, especially in Denmark.

These studies have discussed the management of the complexity of fiction in indexing, as well as the concept of “aboutness” in fiction retrieval (Andersson and Holst 1996; Beghtol 1992; Bell 1991; Pulli 1992; Ranta 1991; Moraes 2012). There are also publications with some similarities to these studies that have discussed the possibilities of creating AI systems for fiction, because those systems are basically built upon indexes (Rich 1979 and 1986). Furthermore, there are several reports describing experiments of fiction indexing in various libraries (e.g., MacPherson 1987, who examined the creation of children’s literature indexes in a school environment).

5.0 Classification practices and principles for fiction

All the major classification schemes used in libraries have included fiction. UDC and Dewey use the genre and language/culture of fictional works when dividing works into subclasses. The following subdivision is an example utilizing the Dewey system:

– 820 English & Old English literatures
– 821 English poetry
– 822 English drama
– 823 English fiction
– 824 English essays
– 825 English speeches
– 826 English letters
– 827 English humour and satire
– 828 English miscellaneous writings

Because of the analytical-synthetic and multi-faceted nature of the UDC, one can also apply a special auxiliary subdivision for literary forms, genres, techniques and different languages. The Colon Classification is rather like the UDC, applying the following facets for fiction: language, form, author, work (http://www.isko.org/cyclo/colon_classification, see also Satija 2017).

These main classification schemes have been utilised as a basis for the shelf classification of fiction, which has been an important aspect in developing the classification of fiction. The shelf classification of fiction has the longest tradition in the Anglo-American libraries. The classes used have mainly been recreational and popular fiction genres, e.g., thrillers, horror, romances. The reason for using these genres is very clear: recreational genres are used in advertising, these books are often published in series, and they are usually written in the form of a certain genre which is targeted to certain readers. The rules of reading and writing generic fiction are very clear in recreational fiction. On the other hand, there are various and heterogeneous sets of genre classifications, especially for the printed stock, and these are used in both libraries and bookstores.

Historically we can separate three different ways of developing a shelf classification of fiction. The oldest and most widely used system is to separate a few well-known genres from the rest of the fiction stock. Usually these genres are also the most popular for the users of the library; for example, detective novels are considered as a distinct shelf class in nearly every public library (Harrell 1985, 14; Juntunen and Saarti 1992, 108; Jennings and Sear 1989). The second step in shelf classification is to separate popular fiction from the fiction stock and arrange it according to genres (see, e.g., Alternative arrangement 1982, 75-76). Usually here, the most popular genres of fiction are shelved separately, e.g., science fiction, romance and thrillers (for the definition of these genres, see Trott 2017).

The third and the most challenging way is to try to classify the entire fiction stock. Two different approaches have been applied; in the first, the whole stock is divided into classes without any distinction made between recreational and serious fiction (see, e.g., Burgess 1936; Saarti 1997b). In the other model, the fiction stock is initially divided into two main classes, recreational and serious fiction, and then those main classes are divided into subcategories (see, e.g., Spiller 1980, 241).

The idea of dividing fiction into classes based on genres has also been utilised in the present commercial and library software used on the internet. All the major internet bookshops have developed their own genre-based classifications for fiction (Wikipedia has a list of fifty-three “genre” categories for fiction with a total of 528 subcategories; see Appendix 1). In addition, statistical tools are used which analyse the user’s preferences in order that they can recommend new fiction to their customers. The users can also create their own recommendation lists that are published. This type of social and statistical knowledge organization is also used in different types of so-called fan fiction sites (Smith 2017).

The major change here is that in a digital environment, the classification is not tied to physical shelves and thus the concept of having a multimodal classification can be realized, i.e., the same fictional work can be in different classes at the same time. This has also enhanced the integration between the indexing and classification of fiction (see, e.g., Pawlicki 2017).

6.0 Development of fiction thesauri and ontologies

The thesauri and subject heading lists for fiction started to evolve from the needs of individual libraries and/or because of the initiative of a single individual. Subsequently, these started to expand and recently we have also seen systems operating at the national level. At first, they were mostly simple word-lists or general thesauri/subject heading lists that had been supplemented with terms for fiction. Based on these experiments, the subject heading lists and fiction thesauri have evolved in order to strive for unity of indexing and centralised cataloguing services (Pulli 1992). In the Nordic countries, there is an on-going project, based on the ideas of the BookHouse concept. Its main objective is to enable the dissemination of the cataloguing data of fiction between the Nordic countries (Pejtersen et al. 1996, 75).

In the United States, the development started at the national level when the American Library Association’s Subject Analysis Committee published their Guidelines on Subject Access to Individual Works of Fiction, Drama etc. In the guidelines, the committee recommended that the following aspects should be indexed from fictional works: form/genre, persons, setting and topics. Based on this recommendation and on the twenty-three-page supplementary word list for the Library of Congress Subject Headings, a project was started in 1991, when ten libraries began to index fiction. In addition, Olderr has devised a supplementary list of fiction subject headings, which is broader than the LC thesaurus (Young 1992, 89-94; see also Young and Mandelstam 2013). The first edition of Olderr’s fiction subject headings was published in 1987 and as a thesaurus in 1991. It includes terms from six different categories: topics, genres, geographical settings, chronological settings, characters and treatment (of the theme). The latter are terms that describe more specifically the genre of the work (Olderr 1991, ix-xx). The American Library Association (2000) has also published rules for the subject headings, which are intended to ease access to fiction.

In Sweden, the largest thesaurus is Jansson and Södervall’s Tesaurus för indexering av skönlitteratur (Thesaurus for Indexing Fiction), which was published in 1987. It is divided into two parts, systematic and alphabetical, with the former being arranged as a thesaurus. In the systematic part, the terms are divided into three main facets, which are setting (ram), persons (person) and subject (ämne). These are divided into sub-facets so that setting is divided into time (tid) and place (rum); persons are divided into development (utveckling), social relations (sociala relationer) and profession/occupation (yrke/verksamhet); and subjects are divided into ideology (ideologi), action (aktivitet), nature (natur) and human body (människokropp). As stated by the editors, the borders between different facets are not fixed and placing some of the terms only in one facet is based predominantly on the principles of the design of this thesaurus in which each term can be placed only in one facet (Jansson and Södervall 1987, 4-6). In the Nordic countries, several subject-heading lists have been developed based on the BookHouse concept (see the Pejtersen section above, Section 3.0, see also Eriksson 2005).

The Swedish Library Association’s Fiction Indexing Committee was inaugurated in 2005. As a result of this Committee’s work, two subject heading lists were produced, i.e., subject headings of fiction for children and subject headings of fiction for adults. The subject headings have a hierarchical and faceted structure: 1) genre; 2) date; 3) setting; 4) subject; 5) character; and, 6) form. For children’s literature, form and genre are combined as form/genre (Aagaard and Viktorsson 2014, 68).

In Finland, there have also been some experiments conducted on indexing fiction by Finnish librarians and Finnish book traders before the appearance of the Finnish Thesaurus for Fiction. They all used the Finnish General Thesaurus but very soon it was appreciated that it lacked the appropriate terms for indexing fiction (Pulli 1992, 2-4). Based on the experiences of these pilot projects, as well as those of the Finnish project based on the BookHouse concept, it soon became apparent that there was a need for a centralised indexing service for fiction. This service was needed because indexing of fiction is laborious; it lacks traditions and guidelines, for example, a subject heading list and, furthermore, there has been no decision about which thesaurus should be followed.

The Helsinki University Library (also the National Library of Finland) decided together with the BTJ Group Ltd to initiate a project in order to make a subject-heading list for fiction. The editing was started in the fall of 1993, and in addition to deciding who would be the editor, an editorial board was appointed to oversee the project. The subject-heading list was soon changed into the form of a thesaurus in order to match it to the other thesauri published by the Helsinki University Library. The first version was then tested in Finnish public libraries, and finally the first edition of Kaunokki was released in 1996 (in Swedish, Bella, 1997).

The principal problem in devising a subject-heading list for fiction was deciding on the structure under which the terms were to be collected and organised. The editorial board of Kaunokki decided that the subject headings should be arranged in the form of a thesaurus and the organisation of the thesaurus should be made to follow the facets mentioned in the previous studies on the classification and indexing of fiction. In addition, an alphabetical index of all the terms used was added to the end of the thesaurus.

The facets used were as follows:

– Terms that describe fictional genres and their explanations.
– Terms that describe events, motives and themes.
– Terms that describe actors.
– Terms that describe settings.
– Terms that describe times.
– Terms that describe other, mostly technical and typographical aspects.

Four of the above-mentioned facets (events, actors, spaces and times) have been mentioned in almost all the previous studies as the main categories being applied for fiction indexing. Thus, Beghtol drew the conclusion (1994a, 157) that: “Characters, Events, Spaces and Times may be taken as fundamental data categories for fiction.”

If we compare Beghtol’s list to Ranganathan’s PMEST facets, as Shatford undertook in her system for indexing pictures (Shatford 1986, 49), we can see that those are very similar to Shatford’s MEST (matter, energy, space, time) facets. In her system, Shatford made the decision to combine personality and matter facets into one group (actors), and then she referred with the energy facet to what these actors were doing.

ent types of fiction databases, i.e., library, commercial and fan-based environments. The ontology-based approach
In Kaunokki, the solution was that terms that could help in improving this situation (see also Rafferty describe the genre of the fictional work were considered to 2018 on social tagging). correspond to the personality facet. This seems logical be- cause the genre or the kind of literature describes the per- 7.0 Systematic approach to the fictional knowledge sonality of the work and in fact determines many of the organization events, spaces and times described in a novel (see, e.g., Wellek and Warren 1980, 226-237; Saarti 1999). The matter It is apparent that not only the indexing and classification facet on the other hand corresponds to that of events and but also the search and retrieval systems for fiction must motives in Kaunokki and the energy facet to that of actors. become multi-faceted in order to meet the diverse needs By incorporating Ranganathan’s Basic Subject (Ranga- of different users. Figure 2 describes a model for a search nathan 1969, 200), one could also make a distinction be- and retrieval system of fiction (Saarti 2000a). It consists of tween different types of fictional works. five main blocks (databases) that represent the different In the group “other,” mainly terms that describe aspects actors of the fictional communication system—works of outside the factual text of the work were included, because art (texts), their subject indexing and abstracts, history of they are regularly asked by library users. For example, these their reception by readers, history of the writers and cul- are the previously mentioned aspects included in Pejtersen’s tural history (see, e.g., Spiter and Pecoskie 2016). With the accessibility category (Pejtersen and Austin 1983, 234). aid of this kind of system, one can document in a holistical When collecting the terms for the thesaurus, it was ob- manner the different aspects of the meaning of a work of vious that the context where the thesaurus is used would fiction, i.e., what the work of fiction is about. 
play an important role in choosing the right terms and the During the past three decades, we have seen a rapid appropriate depth of the terms being chosen. A concrete growth in various types of information systems for works example of that was the subject headings for the indexing of fiction. Figure 2 is a framework for the various layers of of juvenile literature. They were included in Kaunokki, alt- the system’s contents. As discussed earlier, the greatest chal- hough they could as well have been published in a separate lenge in the analysis of fictional content is its interpreta- special thesaurus. Another problem was considering the tional character. This means that a user-analysis is of the ut- environment where the thesaurus would be used. From the most importance when evaluating the pros and cons of any very onset, the decision was made that Kaunokki should system. be suitable for public libraries. For this reason, a great It seems that the commercial systems are incorporating many of the terms that students of literature would con- more content elements and especially more user behaviour- sider important aspects of fictional works were omitted based data into their systems. For example, this can be seen from the thesaurus. One solution for this problem would when comparing Amazon books’ user interface (https:// be to create a Thesaurus for Literary Research, which is cur- www.amazon.com) and WorldCat’s FictionFinder (https:// rently under preparation. There is already an example of experimental.worldcat.org/xfinder/fictionfinder.html). This this in Italy—Thesaurus di letteratura italiana (Negrini and multi-faceted use of tools and different types of access Zozi 1995; see also Negrini and Adamo 1996; Aschero et points seem to be very useful when searching for fiction. al. 1995). In the second edition of Kaunokki (Saarti The aesthetic point of view has also given new possibilities 2000b), this aspect was incorporated. 
Kaunokki was also for fiction retrieval, e.g., as can be seen in Whichbook.net developed in order to make it a thesaurus for the entire (https://www.whichbook.net//), where the user can utilize spectrum of fiction, i.e., literature, movies, comics etc. factor-based search tools with more interpretational type of The Kaunokki has also been implemented as an ontol- data. The third, and maybe the most rapidly evolving envi- ogy-based linked metadata-based service and this has been ronment, are the different types of user-motivated infor- utilized when creating the Finnish BookSampo service for mation systems, e.g., fan-fiction sites and services that utilize fictional works. BookSampo is a semantic portal, encapsu- a lot of unstructured fiction content analysis that is based lating metadata about practically all Finnish fiction litera- on the users’ needs (e.g., https://www.fanfiction.net/ and ture available in Finnish public libraries (Mäkelä , Hypén Smith 2017). and Hyvönen 2011, 173; Saarti and Hypén 2010). As Branch et al. (2017) emphasize, there is a great need 8.0 Conclusions for the ontological structures of fiction. This is because: 1) of the multi-faceted nature of the fiction; and, 2) the One can conclude from the studies conducted on indexing active and broad culture of fan fiction. It seems that there and abstracting of fictional works that the effect of the is no structural coherence and consistency between differ- interpretation of the work of art has a major impact on 328 Knowl. Org. 46(2019)No.4 J. Saarti. Fictional Literature: Classification and Indexing

Figure 2. A broad model for a search and retrieval system for fiction. the content description of the work. This highlights the In addition, cultural and functional aspects are im- importance of these tools for librarians and patrons, they portant from both the scientific and practical viewpoints. should not be so restrictive that they control the content The multicultural point of view is especially interesting as well as the vocabulary used in the indexing of (fictional) with respect to fiction. Centralised indexing services for works. Of course, the interpretational aspect of content fiction have been available in several countries for years, description is a subject that requires clarification, not only and their experiences can be a basis for assessing the ben- for fictional works but also for scientific material. efits and drawbacks of a centralised service. Additional studies will be needed in order to improve the There is much work to be done in developing better in- indexing and classification of fiction. One important topic formation systems for handling fiction. In fact, at times it is the effect of the environment on indexing and whether seems to be a never-ending task if one wishes to devise the environment impacts on the use of indexes, which is more sophisticated and more tailored indexing and classi- also crucial for understanding the relationship between cen- fication systems (e.g., see Bartlet and Hughes 2011). The tralised and local indexing. Furthermore, democratic index- latest technological possibilities have created truly revolu- ing in different libraries—a model that enables the users to tionary tools for fictional retrieval. These have opened new contribute to the indexing—requires more investigation. 
perspectives for totally new types of indexing: e.g., emo- This could be one model through which we could incorpo- tional indexing referring to the reader’s experience and rate the interpretations and opinions of different individuals promotional tools for fictional literature. For libraries, this into our information systems (see Hidderley and Rafferty will also mean soul-searching, i.e., librarians need to decide 1997 and investigations of the development in the search what they must concentrate on in this field, what is best and retrieval systems of the internet book-stores). Knowl. Org. 46(2019)No.4 329 J. Saarti. Fictional Literature: Classification and Indexing left for other actors and finally identify areas where co-op- ed. Nancy J. Williamson and Michelè Hudon. Amster- eration will be most beneficial. dam: Elsevier, 39-48. Beghtol, Clare. 1994a. The Classification of Fiction: The Devel- References opment of a System Based on Theoretical Principles. Metuchen, NJ: Scarecrow Aagaard, Harriet and Elisabet Viktorsson. 2014. “Subject Beghtol, Clare. 1994b. “Domain Analysis, Literary War- Headings for Fiction in Sweden: A Cooperative Devel- rant, And Consensus: The Case of Fiction Studies.” opment.” Cataloging & Classification Quarterly 52, no. 1: Journal of the American Society for Information Science 46: 30- 62-8. doi: 10.1080/01639374.2013.855603 44. Adkins, Denice and Jenny E. Bossaller 2007. “Fiction Ac- Beghtol, Clare. 1997. “Stories: Applications of cess Points Across Computer-Mediated Book Infor- Discourse Analysis to Issues in Information Storage mation Sources: A Comparison of Online Bookstores, and Retrieval.” Knowledge Organization 24: 64-71. Reader Advisory Databases, And Public Library Cata- Bell, Hazel K. 1991. “Indexing Fiction: A Story of Com- logs.” Library & Information Science Research 29, no. 3: plexity.” The Indexer 17: 251-6. 354-68. Bierbaum, Esther. 1995. Review of The Classification of Fic- American Library Association. 
Subcommittee on the Re- tion: The Development of a System Based on Theoretical Princi- vision of the Guidelines on Subject Access to Individ- ples by Clare Beghtol. Journal of American Society for Infor- ual Works of Fiction. 2000. Guidelines on Subject Access to mation Science 46: 389-90. Individual Works of Fiction, Drama, etc. 2nd ed. Chicago: Birdi, Briony and Nigel Ford. 2018. “Towards a New So- American Library Association. ciological Model of Fiction Reading.” Journal of the As- Ainley, Patricia and Barry Totterdell, eds. 1982. Alternative sociation for Information Science and Technology 69: 1291-303, Arrangement: New Approaches to Public Library Stock. Lon- doi: 10.1002/asi.24053 don: Association of Assistant Librarians. Branch, Frank Theresa Arias, Jolene Kennah, Rebekah Andersson, Rolf and Erik Holst. 1996. “Indexes and Phillips, Travis Windleharth and Jin Ha Lee. 2017. Other Depictions of Fiction: A New Model For Anal- “Representing Transmedia Fictional Worlds Through ysis Empirically Tested.” Svensk biblioteksforskning nos. 2- Ontology.” Journal of the Association for Information Science 3: 77-95. and Technology 68: 2771-82. Aschero, B., G. Negrini, R. Zanola and P. Zozi. 1995. Burgess, L. A. 1936. “A System for The Classification and “SYSTEMATIFIER: A Guide for The Systematisation Evaluation of Fiction.” The Library World 38: 179-82. Of Italian Literature.” In Konstruktion und Retrieval von Chandler, Daniel. 1997. “An Introduction to Genre The- Wissen: 3. Tagung der Deutschen ISKO Sektion einschliesslich ory.” http://visual-memory.co.uk/daniel/Documents/ der Vortragë des Workshops “Thesauri als terminologische Lex- intgenre/intgenre1.html ika”, Weilburg, 27.-29.10.1993, ed. Norbert Meder, Peter Collins, Jim. 2010. Bring on the Books for Everybody: How Lit- Jaenecke and Winfried Schmitz-Esser. Fortschritte in erary Culture Became Popular Culture. Durham, NC: Duke der Wissensorganisation 3. Frankfurt/Main: Indeks University Press. Verlag, 125-33. 
Eriksson, Rune. 2005. “Skønlitteraturen i DanBib: Klassi- Bartlett, Sarah and Bill Hughes. 2011. “Intertextuality and fikation, indeksering, noter.” Dansk biblioteksforskning 1, the Semantic Web: Jane Eyre as a Test Case for Model- no. 3: 7-20. ling Literary Relationships with Linked Data.” Serials: Eriksson, Rune. 2010. klassifikation og indeksering af skønlitter- The Journal for the Serials Community 24, no. 2: 160-5. atur: Et teoretisk og historisk perspektiv. Copenhagen: Royal Beghtol, Clare. 1986. “Bibliographic Classification Theory School of Library and Information Science. https://cu and Text Linguistics: Aboutness Analysis, Intertextual- ris.ku.dk/ws/files/47028127/Eriksson_phd_2010.pdf. ity and The Cognitive Act of Classifying Documents.” Green, Rebecca. 1997. “The Role of Relational Structures Journal of Documentation 42: 84-113. in Indexing for The Humanities.” Information Services & Beghtol, Clare. 1989. “Access to Fiction: A Problem in Use 17: 85-100. Classification Theory and Practice, Part 1.” International Harrell, Gail. 1985. “The Classification and Organization Classification 16: 134-40. of Adult Fiction in Large American Public Libraries.” Beghtol, Clare. 1990. “Access to Fiction: A Problem in Public Libraries, 24: 13-4. Classification Theory and Practice, Part 2.” International Hidderley, Rob and Pauline Rafferty. 1997. “Democratic Classification 17: 21-7. Indexing: An Approach to The Retrieval of Fiction.” Beghtol, Clare. 1992. “Toward A Theory of Fiction Anal- Information Services & Use 17: 101-9. ysis for Information Storage and Retrieval.” In Classifi- Jansson, Eiler and Bo Södervall. 1987. Tesaurus för indexering cation Research for Knowledge Representation and Organization, av skönlitteratur. Högskolan i Borås. Institutionen Bibli- 330 Knowl. Org. 46(2019)No.4 J. Saarti. Fictional Literature: Classification and Indexing

otekshögskolan. Specialarbete 1987:7. Borås: Högskolan Olderr, Steven. 1991. Olderr’s Fiction Subject Headings: A Sup- i Borås. plement and Guide to The LC Thesaurus. Chicago: Ameri- Jennings, Barbara and Lyn Sear. 1989. “Novel Ideas: A can Library Association. Browsing Area for Fiction.” Public Library Journal 4, no. 3: Pawlicki, Kamil. 2017. “Genre Theory Applied: Genre and 41-4. Form Terms in the National Library of Poland Cata- Juntunen, Arja and Jarmo Saarti. 1992. Kaunokirjallisuuden logue.” Paper presented at IFLA WLIC 2017, Wrocław, sisällönkuvailu yleisissä kirjastoissa. Kirjastotieteen- ja in- Poland. http://library.ifla.org/id/eprint/1644 formatiikan pro gradu-tutkielma. Oulu: Oulun yliopisto. Pejtersen, Annelise Mark, Hanne Albrechtsen, Lena MacPherson, Ruby. 1987. “Children’s Literature Indexes at Lundgren, Ringa Sandelin and Riitta Valtonen. 1996. Moray House.” Library Review 4: 254-60. Subject Access to Scandinavian Fiction Literature: Index Meth- Martens, Gunter. 1975. “Textstrukturen aus rezep- ods and OPAC Development. TemaNord Culture 608. Co- tionsästhetischer Sicht: Perspektiven einer Textästhetik penhagen: Nordic Council of Ministers. auf der Grundlage des Prager Strukturalismus.” In Lit- Pejtersen, Annelise Mark and Jutta Austin. 1983. “Fiction erarische Rezeption: Beiträge zur Theorie des Text-Leser-Verhält- Retrieval: Experimental Design and Evaluation of a nisses und seiner empirischen Erforschung, ed. Hartmut Heuer- Search System Based on Users’ Value Criteria: Part 1.” man, Peter Hühn and Brigitte Rötger. ISL 4. Paderborn: Journal of Documentation 39: 230-46. Ferdinand Schöningh, 23-49. Pejtersen, Annelise Mark and Jutta Austin. 1984. “Fiction Mikkonen, Anna and Pertti Vakkari. 2016. “Finding Fic- Retrieval: Experimental Design and Evaluation of a tion: Search Moves and Success in Two Online Cata- Search System Based on Users’ Value Criteria: Part 2.” logs.” Library & Information Science Research 38: 60-8. 
doi: Journal of Documentation 40: 25-35. 10.1016/j.lisr.2016.01.006 Pejtersen, Annelise Mark. 1989. “A Library System for In- Moraes, João. 2012. “Aboutness In Fiction: Methodologi- formation Retrieval based on a Cognitive Task Analysis cal Perspectives for Knowledge Organization.” In Cate- and Supported by an Icon-Based Interface.” In Proceed- gories, Contexts and Relations in Knowledge Organization: Pro- ings of the 12th Annual International ACMSIGIR Conference ceedings of the Twelfth International ISKO Conference 6-9 Au- on Research and Development in Information Retrieval June 25- gust 2012 Mysore, India, ed. A. Neelameghan and K. S. 28, 1989, Cambridge, MA, ed. N.J. Belkin and C.J. van Raghavan. Advances in Knowledge Organization 13. Rijsbergen, special issue, ACM SIGIR Forum 23, SI: 40- Wurzburg:̈ Ergon, 242-48. 7. doi: 10.1145/75335.75340 Mäkelä, Eetu, Kaisa Hypén and Eero Hyvönen. 2011. Pulli, Riitta. 1992. Kaunokirjallisuuden keskitetty indeksointi Su- “BookSampo: Lessons Learned in Creating a Semantic omessa. Helsinki: Helsingin yliopiston kirjasto. Portal for Fiction Literature.” In The Semantic Web-ISWC Rafferty, Pauline. 2012. “Epistemology, Literary Genre and 2011: 10th International Semantic Web Conference, Bonn, Ger- Knowledge Organisation Systems.” In 20 Años del many, October 23-27, 2011, Proceedings. Part II, ed. Lora Capítulo Español de ISKO: Actas del X Congreso ISKO Capí- Aroyo, Chris Welty, Harith Alani, Jamie Taylor, Abraham tulo Español, Ferrol,a 30 de junio - 1 de julio de 2011, ed. María Bernstein, Lalana Kagal, Natasha Noy and Eva del Carmen Pérez Pais and María G. Bonome. Cursos, Blomqvist. Lecture Notes in Computer Science 7032. congresos e simposios 132. Ferrol: Universidade da Co- Berlin: Springer, 173-88. ruña, Servizo de Publicacións, 553-65. Negrini, Giliola and Giovanni Adamo. 1996. “The Evolu- Rafferty, Pauline 2018. “Tagging.” Knowledge Organization tion of a Concept System: Reflections on Case Studies 45: 500-16. 
of Scientific Research, Italian Literature and Humani- Ranganathan, S. R. 1969. “Colon Classification Edition 7 ties Computing.” In Knowledge Organization and Change: (1971): A Preview.” Library Science with a Slant to Documen- Proceedings of the Fourth International ISKO Conference Wash- tation 6: 193-242. ington, DC, July 15-18, 1996, ed. Rebecca Green. Ad- Ranta, Judith A. 1991. “The New Literary Scholarship and vances in Knowledge Organization 5. Frankfurt/Main: a Basis for Increased Subject Catalog Access to Imagina- Indeks, 275-83. tive Literature.” Cataloging & Classification Quarterly 14, no. Negrini, Giliola and al. 1995. Thesaurus di letteratura Italiana. 1: 3-27. Note di bibliografia e di documentazione scientifica 59. Rich, Elaine. 1979. “User Modeling via Stereotypes.” Cogni- Roma: C.N.R. tive Science 3: 329-54. doi: 10.1207/s15516709cog0304_3 Nielsen, Hans Jørn. 1997. “The Nature of Fiction and Its Rich, Elaine. 1986. “Users as Individuals: Individualizing Significance for Classification and Indexing.” Information User Models.” In Intelligent Information Systems: Progress Services & Use 17: 171-81. and Prospects, ed. R. Davies. Chichester: Ellis Horwood. Knowl. Org. 46(2019)No.4 331 J. Saarti. Fictional Literature: Classification and Indexing

Ross, Catherine S. 2001. “Making Choices: What Readers McDonald and Michael-Devine Clark. 4th ed. Boca Ra- Say About Choosing Books to Read for Pleasure.” The ton, FL: CRC Press, 6: 4242-50. Acquisition Librarian 13, no. 25: 5-21. Ward, Mark and Jarmo Saarti. 2018. “Reviewing, Rebutting, Ross, Catherine Sheldrick, Lynne McKechnie and Paulette and Reimagining Fiction.” Classification, Cataloging & Clas- M. Rothbauer. 2018. Reading Still Matters: What the Research sification Quarterly 56, no. 4: 317-29. doi: 10.1080/016393 Reveals about Reading, Libraries, and Community. Santa Bar- 74.2017.1411414 bara, CA: ABC-CLIO. Wellek, René and Austin Warren. 1980. Theory of literature. Saarinen, Katariina and Pertti Vakkari. 2013. “A Sign of a Harmondsworth: Penguin. Good Book: Readers’ Methods of Accessing Fiction in Vernitski, Anat. 2007. “Developing an Intertextuality-Ori- The Public Library.” Journal of Documentation 69: 736-54. ented Fiction Classification.” Journal of Librarianship and doi: 10.1108/JD-04-2012-0041 Information Science 39, no. 1: 41-52. doi: 10.1177/09610006 Saarti, Jarmo, ed. Kaunokki: Kaunokirjallisuuden asiasanasto. 07074814 1996. Helsinki: BTJ Kirjastopalvelu. Wilson, Mary D., Jodi L. Spillane, Colleen Cook and Anne Saarti, Jarmo, ed. Bella: Specialtesaurus för skönlitteratur. L. Highsmith. 2000. “The Relationship Between Sub- 1997a. Helsingfors: BTJ Kirjastopalvelu. ject Headings for Works of Fiction and Circulation in Saarti, Jarmo. 1997b. “Feeding with The Spoon, Or the Ef- an Academic Library.” Library Collections, Acquisitions, fects of Shelf Classification of Fiction on the Loaning and Technical Services 24: 459-465. doi: 10.1016/S1464- of Fiction.” Information Services & Use 17: 159-69. 9055(00)00156-1 Saarti, Jarmo. 1999. “Fiction Indexing and the Development Young, J. Bradford. 1992. 
Review of Olderr’s Fiction Subject of Fiction Thesauri.” Journal of Librarianship and Infor- Headings: A Supplement and Guide to the LC Thesaurus, by mation Science 31: 85-92. Steven Olderr. Cataloging & Classification Quarterly 15, no. Saarti, Jarmo. 2000a. “Taxonomy of Novel Abstracts 1: 89-94. Based on Empirical Findings.” Knowledge Organization 27: Young, Janis and Yael Mandelstam. 2013. “It Takes a Vil- 213-20. lage: Developing Library of Congress Genre/Form Saarti, Jarmo, ed. 2000b. Kaunokki: fiktiivisen aineiston Terms.” Cataloging & Classification Quarterly 51, no. 1-3: asiasanasto. Helsinki: BTJ Kirjastopalvelu. 6-24, doi: 10.1080/01639374.2012.715117 Saarti, Jarmo. 2002. “Consistency of Subject Indexing of Novels by Public Library Professionals and Patrons.” Appendix 1. Journal of Documentation 58: 49-65. Saarti, Jarmo and Kaisa Hypén. 2010. “From Thesaurus to Category: Fiction by genre. From Wikipedia, The Free En- Ontology: The Development of the Kaunokki Finnish cyclopedia. C refer to number of subcategories; Science fic- Fiction Thesaurus.” The Indexer 28, no. 9: 50-58. tion, for example, have 21 subcategories, total = 528 cate- Satija, M.P. 2017. “Colon Classification (CC).” Knowledge gories and P refers to the number of Wikipedia pages in the Organization 44: 291-307. category. Segers, Rien T. 1985. Kirja ja lukija: johdatusta kirjallisuuden- tutkimuksen uuteen suuntaukseen, trans. Lili Ahonen. Tietol- Fictional characters by genre C( 17) ipas, 97. Helsinki: SKS. Translation of Het lezen van litera- Fiction writers by genre C( 21) tuur: een inleiding tot een nieuwe literatuurbenadering, 1980. Shatford, Sara. 1986. “Analyzing the Subject of a Picture: Absurdist fiction(C, 61 P 2) A Theoretical Approach.” Cataloging & Classification Adventure fiction(C, 27 P 19) Quarterly 6, no. 3: 39-62. Children’s literature(C, 28 P 21) Smith, Joanna. 2017. “The Ultimate Guide to Fanfiction and Christian fiction(C, 12 P 7) Fanfiction Sites.” Medium (blog), Dec 19. 
https://me Christianity in fiction(C, 10 P 8) dium.com/@joannasmith008/fanfiction-428029544a12 Coming-of-age fiction(C, 41 P 6) Spiller, David. 1980. “The Provision of Fiction for Public Crossover fiction(C, 26 P 12) Libraries.” Journal of Librarianship 12, no. 4: 238-66. Fiction narrated by a dead person(C, 66 P 1) Spiter, Louise and Jen Pecoskies. 2016. “In the Readers’ Dystopian fiction(C, 36 P 15) Own Words: How User Content in the Catalog Can En- Environmental fiction books(C, 77 P 1) hance Readers’ Advisory Services.” Reference & User Ser- Erotic fiction(C, 7 P 8) vices Quarterly 56, no. 2: 91-5. Family saga(C, 6 P 1) Trott, Barry. 2017. “Popular Literature Genres.” In Ency- Fantasy(C, 6 P 21) clopedia of Library and Information Sciences, ed. John D. Feminist fiction(C, 24 P 4) Fiction with unreliable narrators(C, 258 P 2) 332 Knowl. Org. 46(2019)No.4 J. Saarti. Fictional Literature: Classification and Indexing

Ghost stories(C, 30 P 6) Women’s fiction 2) C, 9 P) Historical fictionC, 51 17) P, 2 F) (C, 5 P 8) Horror fiction(C, 50 P 25) Young adult fiction(C, 53 P 4) Islam in fiction(C, 31 P 4) Islamic fiction(C, 2 P 2) Pages in category “Fiction by genre” (This list may not re- LGBT fiction C( 10) flect recent changes). Men’s fiction C( 1) Metafiction(C, 11 P 4) Anti-romance Military fiction(C, 26 P 8) Atomic bomb literature (C, 17 P 3) Authoritarian literature Motorcycling in fiction(C, 5 P 5) Bizarro fiction (C, 43 P 22) Caper story (C, 12 P 2) Cell phone novel Novels by genre(C, 2 P 72) Comic novel Occult detective fiction(C, 31 P 8) Overpopulation fiction P( 43) Parallel literature(C, 32 P 1) Ethnofiction Penny dreadfuls P( 5) Existentialist fiction Philosophical fiction(C, 11 P 4) Exploitation fiction Political fiction(C, 14 P 10) Fabulation Psychological fiction(C, 10 P 8) Fragmentary novel Pulp fiction(C, 25 P 10) Hysterical realism Rapid human age change in fiction P( 16) I Novel Rapid human growth change in fiction P( 4) Invasion literature Fiction about religion C, 19 30) P) Künstlerroman Romantic fiction(C, 15 P 14) Musical fiction Science fiction(C, 7 P 21) New adult fiction Speculative fiction(C, 33 P 39) Northern (genre) (C, 5 P 20) Urban fiction Thrillers(C, 21 P 16) Western (genre) Urban fiction P( 19) Young adult fiction Utopian fiction(C, 30 P 3) Young adult romance literature Western (genre)(C, 15 P 20)
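
As a rough illustration of the multi-faceted indexing and retrieval discussed in this article, the six Kaunokki facets can be sketched as a small data structure with a facet-based search. This is a minimal sketch under stated assumptions: the class name, function name, facet identifiers and sample records are hypothetical and invented for illustration; they are not taken from Kaunokki, BookSampo or any library system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# The six facets listed for Kaunokki in this article.
# The short identifiers are illustrative assumptions, not official names.
FACETS = ("genre", "events", "actors", "settings", "times", "other")


@dataclass
class FictionRecord:
    """One indexed work: a title plus a set of index terms per facet."""
    title: str
    terms: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, facet: str, term: str) -> None:
        """Attach an index term to one of the known facets."""
        if facet not in FACETS:
            raise ValueError(f"unknown facet: {facet}")
        self.terms.setdefault(facet, set()).add(term)


def search(records: List[FictionRecord], **wanted: str) -> List[FictionRecord]:
    """Return the records that carry every requested term in its facet."""
    return [
        record for record in records
        if all(term in record.terms.get(facet, set())
               for facet, term in wanted.items())
    ]


# Invented sample data, for illustration only.
novel_a = FictionRecord("Example Novel A")
novel_a.add("genre", "detective novels")
novel_a.add("settings", "Helsinki")

novel_b = FictionRecord("Example Novel B")
novel_b.add("genre", "science fiction")
novel_b.add("times", "future")

hits = search([novel_a, novel_b], genre="detective novels", settings="Helsinki")
print([record.title for record in hits])
```

Keeping each facet as a separate set of terms means a query can combine access points (for example, genre and setting) without forcing a single hierarchy on the work, which is the property the faceted thesauri described above aim for.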


Books Recently Published Compiled by J. Bradford Young

DOI:10.5771/0943-7444-2019-4-333

Abrahamse, Ben. 2019. Sudden Position Guide: Cataloging and Metadata. Chicago: Collection Management Section of the Association for Library Collections & Technical Services.
Balzer, Wolfgang and Karl R. Brendel. 2019. Theorie der Wissenschaften. Wiesbaden: Springer.
Broughton, Vanda. 2019. Facet Analysis. London: Facet.
Byström, Katriina, Jannica Heinström and Ian Ruthven, eds. 2019. Information at Work: Information Management in the Workplace. London: Facet.
Craven, Jenny and Paul Levay, eds. 2019. Systematic Searching. London: Facet.
Deppert, Wolfgang. 2019. Die Verantwortung der Wissenschaft. Vol. 4 of Theorie der Wissenschaft. Wiesbaden: Springer.
Galitsky, Boris. 2019. Developing Enterprise Chatbots: Learning Linguistic Structures. Cham: Springer.
Hoffman, Gretchen L. 2019. Organizing Library Collections: Theory and Practice. Lanham: Rowman & Littlefield.
Johnson, Eric O. 2019. Working as a Data Librarian: A Practical Guide. Santa Barbara, CA: Libraries Unlimited.
Maguire, Rachael. 2019. Information Rights for Records Managers. London: Facet.
Martin, Lisa Marie. 2019. Everyday Information Architecture. New York: A Book Apart.
McLeish, Simon, ed. 2019. Getting Resource Discovery Right for Your User Community. London: Facet.
Morris, Anthony. 2019. Emerging Issues in Academic Library Cataloging & Technical Services. New York: Primary Research Group.
Schopflin, Katharine and Matt Walsh. 2019. Practical Knowledge and Information Management. London: Facet.
Smith, Nicholas D., ed. 2019. Knowledge in Ancient Philosophy. London: Bloomsbury Academic.


Publisher

Ergon – ein Verlag in der Nomos Verlagsgesellschaft mbH
Waldseestraße 3-5
D-76530 Baden-Baden
Tel. +49 (0)7221-21 04-667
Fax +49 (0)7221-21 04-27
Sparkasse Baden-Baden Gaggenau
IBAN: DE05 6625 0030 0005 0022 66
BIC: SOLADES1BAD

Editor-in-chief (Editorial office)

KNOWLEDGE ORGANIZATION
Journal of the International Society for Knowledge Organization
Richard P. Smiraglia, Editor-in-Chief
[email protected]

Instructions for Authors

Manuscripts should be submitted electronically (in Microsoft® Word format) in English only via ScholarOne at https://mc04.manuscriptcentral.com/jisko. Manuscripts that do not adhere to these guidelines will be returned to the authors for resubmission in proper form.

Manuscripts should be accompanied by an indicative abstract of approximately 250 words. Manuscripts of articles should fall within the range 6,000-10,000 words. Longer manuscripts will be considered on consultation with the editor-in-chief.

A separate title page should include the article title and the author’s name, postal address, and E-mail address. Only the title of the article should appear on the first page of the text. Contact information must be present for all authors of a manuscript. To protect anonymity, the author’s name should not appear on the manuscript.

Criteria for acceptance will be appropriateness to the field of knowledge organization (see Scope and Aims), taking into account the merit of the contents and presentation. It is expected that all successful manuscripts will be well-situated in the domain of knowledge organization, and will cite all relevant literature from within the domain. Authors are encouraged to use the KO literature database at http://www.isko.org/lit.html.

The manuscript should be concise and should conform to professional standards of English usage and grammar. Authors whose native language is not English are encouraged to make use of professional academic English-language proofreading services. We recommend Vulpine Academic Services ([email protected]).

Manuscripts are received with the understanding that they have not been previously published, are not being submitted for publication elsewhere, and that if the work received official sponsorship, it has been duly released for publication. Submissions are refereed, and authors will usually be notified within 6 to 8 weeks.

Under no circumstances should the author attempt to mimic the presentation of text as it appears in our published journal. Instead, please follow these instructions.

In Microsoft® Word please set the language preference (“Tools,” “Language”) to “English (US)” or “English (UK).”

The entire manuscript should be double-spaced, including notes and references.

The text should be structured with decimally-numbered subheadings (1.0, 1.1, 2.0, 2.1, 2.1.1, etc.). It should contain an introduction, giving an overview and stating the purpose, a main body, describing in sufficient detail the materials or methods used and the results or systems developed, and a conclusion or summary.

Author-generated keywords are not permitted.

Footnotes are not allowed. Endnotes are accepted only in rare cases and should be limited in number; all narration should be included in the text of the article. Do not use automatic footnote formatting. Instead, insert a superscript numeral (Format, Font, Superscript) and create the text of the note manually in a separate list at the end of the manuscript, before the reference list.

Paragraphs should include a topic sentence, a developed narrative and a conclusion; a typical paragraph has several sentences. Paragraphs with tweet-like characteristics (one or two sentences) are inappropriate.

Italics are permitted only for phrases from languages other than English, and for the titles of published works. Bold type is not permitted.

Em-dashes should not be used as substitutes for commas. Dashes must be inserted manually (Insert, Advanced Symbol, Em-dash) with no spaces on either side.

Do not use automatic formatting of any kind. To indent, use the ruler. Do not use tabs under any circumstances. For a bulleted list, indent the list using the ruler, then insert bullets (Insert, Advanced Symbol, bullet). Do not use automatically-numbered paragraphs.

Illustrations should be embedded within the document. Photographs (including color and half-tone) should be scanned with a minimum resolution of 600 dpi and saved as .jpg files. Tables should contain a number and caption at the bottom, and all columns and rows should have headings. All illustrations should be cited in the text as Figure 1, Figure 2, etc. or Table 1, Table 2, etc.

Examples of classification arrays should be configured as figures and set into the document as jpgs; they should not be entered as editable text.

Remove all active hyperlinks, including those from reference formatting software (if hovering over the text with a mouse produces a gray highlight, the text is hyperlinked; remove the link “Insert,” “Hyperlink,” “Remove link”).

Reference citations within the text should have the form: (Author year). For example, (Jones 1990). Specific page numbers are required for quoted material, e.g. (Jones 1990, 100). A citation with two authors would read (Jones and Smith 1990); three or more authors would be: (Jones et al. 1990). When the author is mentioned in the text, only the date and optional page number should appear in parentheses: “According to Jones (1990), …” or “Smith wrote (2010, 146): ….” A subsequent page reference to the same cited work (e.g., to Smith 2010) should have the form “(229).” There is never a comma before the date. In-text citations should not be routinely placed at the end of a sentence or after a quotation, but an attempt should be made to work them into the narrative. For example:

“Jones (2010, 114) reported statistically significant results.”
“Many authors report similar data; according to Matthews (2014, 94): “all seven studies report means within ±5%.””

In-text citations should precede block quotations, and are never placed at the end of a block quotation.

References should be listed alphabetically by author at the end of the article. Reference lists should not contain references to works not cited in the text. Websites mentioned in passing in the text should be identified parenthetically with their URLs but not with references unless a specific page of a specific website is being quoted.

Author names should be given as found in the sources (not abbreviated, but also not fuller than what is given in the source). Journal titles should not be abbreviated. Multiple citations to works by the same author should be listed chronologically and should each include the author’s name. Articles appearing in the same year should have the following format: “Jones 2005a, Jones 2005b, etc.”

Proceedings must be identified fully by title, editor, and details of publication.

Journal issue numbers are given only when a journal volume is not through-paginated.

References for published electronic resources should be accompanied by either a URL or DOI but not in lieu of actual publication data; access dates are not allowed.

Unpublished electronic resources may use an access date in lieu of a date of publication. In cases of doubt, authors are encouraged to consult The Chicago Manual of Style 17th ed. (or online), author-date reference system (chapter 15).

Examples:

Dahlberg, Ingetraut. 1978. “A Referent-Oriented, Analytical Concept Theory for INTERCONCEPT.” International Classification 5: 142-51.
Howarth, Lynne C. 2003. “Designing a Common Namespace for Searching Metadata-Enabled Knowledge Repositories: An International Perspective.” Cataloging & Classification Quarterly 37, nos. 1/2: 173-85.
Pogorelec, Andrej and Alenka Šauperl. 2006. “The Alternative Model of Classification of Belles-Lettres in Libraries.” Knowledge Organization 33: 204-14.
Schallier, Wouter. 2004. “On the Razor’s Edge: Between Local and Overall Needs in Knowledge Organization.” In Knowledge Organization and the Global Information Society: Proceedings of the Eighth International ISKO Conference 13-16 July 2004 London, UK, edited by Ia C. McIlwaine. Advances in knowledge organization 9. Würzburg: Ergon Verlag, 269-74.
Smiraglia, Richard P. 2001. The Nature of ‘a Work’: Implications for the Organization of Knowledge. Lanham, Md.: Scarecrow.
Smiraglia, Richard P. 2005. “Instantiation: Toward a Theory.” In Data, Information, and Knowledge in a Networked World; Annual Conference of the Canadian Association for Information Science … London, Ontario, June 2-4 2005, ed. Liwen Vaughan. http://www.cais-acsi.ca/2005proceedings.htm.

Upon acceptance of a manuscript for publication, authors must provide a digital photo and a one-paragraph biographical sketch (fewer than 100 words). The photograph should be scanned with a minimum resolution of 600 dpi and saved as a .jpg file.

© Ergon – ein Verlag in der Nomos Verlagsgesellschaft, Baden-Baden 2019. All Rights reserved.

KO is published by Ergon.

Annual subscription 2019:
– Print + online (8 issues/ann.; unlimited access for your Campus via Nomos eLibrary) € 359,00/ann.
– Prices do not include postage and packing
– Cancellation policy: Termination within 3 months’ notice to the end of the calendar year


Scope

The more scientific data is generated in the impetuous present times, the more ordering energy needs to be expended to control these data in a retrievable fashion. With the abundance of knowledge now available the questions of new solutions to the ordering problem and thus of improved classification systems, methods and procedures have acquired unforeseen significance. For many years now they have been the focus of interest of information scientists the world over.

Until recently, the special literature relevant to classification was published in piecemeal fashion, scattered over the numerous technical journals serving the experts of the various fields such as:

philosophy and science of science
science policy and science organization
mathematics, statistics and computer science
library and information science
archivistics and museology
journalism and communication science
industrial products and commodity science
terminology, lexicography and linguistics

Beginning in 1974, KNOWLEDGE ORGANIZATION (formerly INTERNATIONAL CLASSIFICATION) has been serving as a common platform for the discussion of both theoretical background questions and practical application problems in many areas of concern. In each issue experts from many countries comment on questions of an adequate structuring and construction of ordering systems and on the problems of their use in opening the information contents of new literature, of data collections and survey, of tabular works and of other objects of scientific interest. Their contributions have been concerned with

(1) clarifying the theoretical foundations (general ordering theory/science, theoretical bases of classification, data analysis and reduction)
(2) describing practical operations connected with indexing/classification, as well as applications of classification systems and thesauri, manual and machine indexing
(3) tracing the history of classification knowledge and methodology
(4) discussing questions of education and training in classification
(5) concerning themselves with the problems of terminology in general and with respect to special fields.

Aims

Thus, KNOWLEDGE ORGANIZATION is a forum for all those interested in the organization of knowledge on a universal or a domain-specific scale, using concept-analytical or concept-synthetical approaches, as well as quantitative and qualitative methodologies. KNOWLEDGE ORGANIZATION also addresses the intellectual and automatic compilation and use of classification systems and thesauri in all fields of knowledge, with special attention being given to the problems of terminology.

KNOWLEDGE ORGANIZATION publishes original articles, reports on conferences and similar communications, as well as book reviews, letters to the editor, and an extensive annotated bibliography of recent classification and indexing literature.

KNOWLEDGE ORGANIZATION should therefore be available at every university and research library of every country, at every information center, at colleges and schools of library and information science, in the hands of everybody interested in the fields mentioned above and thus also at every office for updating information on any topic related to the problems of order in our information-flooded times.

KNOWLEDGE ORGANIZATION was founded in 1973 by an international group of scholars with a consulting board of editors representing the world’s regions, the special classification fields, and the subject areas involved. From 1974-1980 it was published by K.G. Saur Verlag, München. Back issues of 1978-1992 are available from ERGON-Verlag, too.

As of 1989, KNOWLEDGE ORGANIZATION has become the official organ of the INTERNATIONAL SOCIETY FOR KNOWLEDGE ORGANIZATION (ISKO) and is included for every ISKO-member, personal or institutional in the membership fee.

Annual subscription 2019: Print + online (8 issues/ann.; unlimited access for your Campus via Nomos eLibrary) € 359,00/ann. Prices do not include postage and packing. Cancellation policy: Termination within 3 months’ notice to the end of the calendar year.

Ergon – ein Verlag in der Nomos Verlagsgesellschaft mbH, Waldseestraße 3-5, D-76530 Baden-Baden, Tel. +49 (0)7221-21 04-667, Fax +49 (0)7221-21 04-27, Sparkasse Baden-Baden Gaggenau, IBAN: DE05 6625 0030 0005 0022 66, BIC: SOLADES1BAD

Founded under the title International Classification in 1974 by Dr. Ingetraut Dahlberg, the founding president of ISKO. Dr. Dahlberg served as the journal’s editor from 1974 to 1997, and as its publisher (Indeks Verlag of Frankfurt) from 1981 to 1997.

The contents of the journal are indexed and abstracted in Social Sciences Citation Index, Web of Science, Information Science Abstracts, INSPEC, Library and Information Science Abstracts (LISA), Library, Information Science & Technology Abstracts (EBSCO), Library Literature and Information Science (Wilson), PASCAL, Referativnyi Zhurnal Informatika, and Sociological Abstracts.