Information Journeys in Digital Archives Joseph Jonathan Pugh
Information journeys in digital archives
Joseph Jonathan Pugh
EngD
University of York
Computer Science
September 2017

Abstract

Archival collections have particular properties that make physical and intellectual access difficult for researchers. This generates feelings of uncertainty in the researchers, leading to a large burden of enquiries to the archive, many of them routine. In this thesis I investigate the information seeking behaviours of archival researchers and the distinct properties of the archive, first through the respective literatures and then through a series of five studies. Using systems, data and researchers from the National Archives, these studies examine the nature of the enquiries archives receive across many channels, the in-person interactions between archivists and researchers in the reading rooms, and the unmediated search behaviours of archival researchers. I proceed to outline the barriers inhibiting research progress and the techniques or 'regulators' used by researchers to surmount or mitigate these barriers. In the final two studies I develop and attempt to validate an instrument for measuring uncertainty in information seeking in large digital collections. This three-factor (disorientation, prospect and preparedness) scale of archival uncertainty allows improvements to online archival systems to be effectively tested before implementation. I also propose system properties which seem likely to help researchers make progress given these factors and which could be tested using this instrument.

Contents

List of figures
List of tables
Acknowledgements
Author’s declaration
Chapter 1: Introduction
  1.1 The best place to hide a book
  1.2 Defining the archive
    1.2.1 What is an archive?
    1.2.2 Digital archive or digital library?
    1.2.3 Information seeking or information retrieval?
  1.4 Research goals
  1.5 Research context
    1.5.1 The National Archives as industrial sponsor
  1.6 Thesis structure and contributions
Chapter 2: Information seeking, an overview
  2.1 Introduction
  2.2 The search for relevance
  2.3 Information needs and behaviours
    2.3.1 Satisficing
    2.3.2 Browsing
    2.3.3 Searching
    2.3.4 Exploratory search
  2.4 Modelling information behaviour
    2.4.1 Classic models
    2.4.2 Behavioural models
    2.4.3 Contextual models
    2.4.4 Cognitive path models
    2.4.5 Macro models
    2.4.6 Do we need better models?
  2.5 Sensemaking
    2.5.1 Brenda Dervin
    2.5.2 Russell, Pirolli and Card
    2.5.3 Klein, Moon and Hoffman
    2.5.4 Limitations of sensemaking
  2.6 Information foraging and scent
  2.7 Conclusion
Chapter 3: The Trouble with Archives
  3.1 Introduction
  3.2 Lost in the stacks
  3.3 Archives and usability
    3.3.1 Guardians and gatekeepers
    3.3.2 Mediators and machines
    3.3.3 There are no jokes about archives
  3.4 Uncertainty
  3.5 Properties of the Discovery system
  3.6 A short study of Discovery queries
    3.6.1 Method
    3.6.2 Results
    3.6.3 Discussion
Chapter 4: Swimming the Channels
  4.1 Introduction
  4.2 Archival enquiries
  4.2 Method
    4.2.1 Data collection
    4.2.2 Content Analysis
    4.2.3 Grounded Theory
  4.3 Results
    4.3.1 Content Analysis
    4.3.2 Grounded Theory
    4.3.3 Archivist as Google
    4.3.4 Ability to survey the terrain
    4.3.5 Seeking reassurance
  4.4 Discussion
  4.5 Limitations
  4.6 Conclusion
Chapter 5: “The question everybody wants to know”
  5.1 Introduction
  5.2 The Reference Interview
  5.3 Methodology
    5.3.1 Data collection
    5.3.2 Method of analysis