Improving the Formatting Tools of CDS Invenio
Total Page:16
File Type:pdf, Size:1020Kb
Improving the Formatting Tools of CDS Invenio J´erˆome Caffaro CERN-THESIS-2009-057 20/10/2006 Supervisors: Jean-Yves Le Meur Dr. Pearl Pu Faltings IT-UDS-CDS, CERN I&C-IIF-HCI, EPFL October 1, 2006 Abstract CDS Invenio is the web-based integrated digital library system developed at CERN. It is a strategical tool that supports the archival and open dissemina- tion of documents produced by CERN researchers. This paper reports on my Master’s thesis work done on BibFormat, a module in CDS Invenio, which for- mats document meta-data. The goal of this project was to implement a completely new formatting module for CDS Invenio. In this report a strong emphasis is put on the user-centered design of the new BibFormat. The bibliographic formatting process and its re- quirements are discussed. The task analysis and its resulting interaction model are detailed. The document also shows the implemented user interface of BibFormat and gives the results of the user evaluation of this interface. Finally the results of a small usability study of the formats included in CDS In- venio are discussed. “Good tools obviate bad processes.” Dr. Michael B. Johnson, Pixar Animation Studios, Apple WWDC, August 2006 vi Contents 1 Introduction 1 1.1 Context . 1 1.1.1 CERN . 1 1.1.2 CDS Invenio . 1 1.1.3 BibFormat . 2 1.2 The Project . 4 1.2.1 Motivations . 4 1.2.2 Objectives . 4 1.2.3 Activities . 5 2 Analysis 7 2.1 Task Analysis . 7 2.1.1 Product Statement . 7 2.1.2 User Population Analysis . 7 2.1.3 Personas . 11 2.1.4 User Needs . 17 2.1.5 User Scenarios . 18 2.1.6 Task Tables and Task Map . 19 2.1.7 Usability Goals . 24 2.2 Comparative Analysis . 24 2.2.1 Previous Formatting Module . 25 2.2.2 Competitors . 30 2.2.3 Synthesis . 34 3 Design 37 3.1 Specifications . 37 3.2 Prototypes . 40 3.3 Final UI Design . 44 3.4 Formatting Engine Design . 51 4 Evaluation 57 4.1 User Evaluation of the BibFormat Admin User Interface . 57 4.2 Usability Study of the Formats . 58 4.2.1 Goals of the Study . 59 4.2.2 Experimental Conditions . 60 4.2.3 Results . 61 4.3 Further Improvements of BibFormat . 62 vii CONTENTS 5 Conclusions 65 Appendices A UML Use Cases 69 B BibFormat Developers APIs 73 Bibliography 79 viii Chapter 1 Introduction 1.1 Context 1.1.1 CERN CERN (European Organization for Nuclear Research) is the world’s largest particle physics center[1]. Located near Geneva on the Frenco-Swiss border, it employs 3000 persons to operate, maintain and develop the complex facilities that allows scientists to discover the building blocks of matter. Founded in 1954, the laboratory was one of Europe’s first joint ventures and includes now 20 Member States. Some 6500 visiting scientists, half of the world’s particle physicists, come to CERN for their research. They represent 500 universities and over 80 nationalities. Among the important discoveries for which many CERN scientists have re- ceived prestigious awards, including Nobel prizes, is the Web. The Web was first thought by Tim Berners-Lee in 1989 as a solution to the problem of loss of in- formation due to the high turnover of people at CERN. His original proposal[2] suggested to use Hypertext to link information systems accross network such that anyone could have access to any important documents produced at CERN. The web has now extended worldwide and has become a primary component of IT. Unfortunately some concepts proposed by Tim Berners-Lee have not made their way into the world wide web: there is for example no typed links (' Semantic web) between information systems, and there is still not always a clear separation between the data and its visual representation. 1.1.2 CDS Invenio CDS Invenio1 is the integrated digital library system developed and used at CERN[3]. It is in some ways the implementation of the original idea of Tim Berners-Lee for scientific documents. As CERN researchers issue about 2000 publications per year, CDS Invenio is an essential tool to capture all of this information and turn it into shared knowl- edge. At CERN CDS Invenio currently has a database of more than 800’000 bibliographic references, including 360’000 fulltext documents[4]. These docu- ments are organized in more than 500 collections. In additions to the documents 1Formerly known as CDSware 1 CHAPTER 1. INTRODUCTION produced at CERN, CDS Invenio harvests about 100’000 documents from ex- ternal sources. CDS Invenio uses standards at its core. It is OAI-compliant[5] to support the open dissemination of these documents. CDS Invenio also uses MARC 21[6] (and its XML derivative MARCXML) to store and process bibli- ographic meta-data. The server is accessible from a web-based interface, which offers both a fast search of documents and a browsable tree-like structure. Collections of docu- ments can have customizable “portals” to support community building. Addi- tionally, CDS Invenio provides collaborative tools such as baskets or alerts for new specific documents [7]. CDS Invenio is a free, open source application (under the terms of the GNU Gen- eral Public License). It is developed at CERN by the CERN Document Server (CDS) section, in the User and Document Services (UDS) group. CDS Invenio is now being used by many scientific institutions worldwide, including EPFL[8]. The software complements other librarians’ tools such as Aleph 500[9], with which CDS Invenio can synchronize. 1.1.3 BibFormat BibFormat is one of the many modules that compose CDS Invenio. Its purpose is to format the records of the database by applying customizable templates. The BibFormat module is almost invisible to end-users of CDS Invenio, in the sense that users only see the formatted output produced by BibFormat, but do not directly interact with the module. Figure 1.1 shows where BibFormat sits in the case of a record access by a user. This workflow is only a conceptual view of CDS Invenio Request for Search Request for formatting document(s) query a record Web search Search BibFormat interface Engine Formatted Formatted Formatted User output record(s) record Records Meta-data Figure 1.1: Simplified scenario of typical access to a record by end-user. the process, as other mechanisms such as caching of pre-formatted records are involved. This scenario repeats for each search request of the user as BibFormat is also responsible for formatting each individual notice of the search results list. Figure 1.2 shows the output produced by BibFormat in the case of a search results displaying and in the case of a detailed view of a record. Notice that BibFormat only produces the red surrounded areas, while other components are laid out by CDS Invenio itself. 2 guest :: session :: alerts :: baskets :: login Search Submit Convert Agenda Webcast Bulletin Library Home > Search Results CERN Document Server Search: einstein any field Search Browse Search Tips :: Advanced Search :: Try your search on... Search collections: *** any collection *** Sort by: Display results: Output format: - latest first - 10 results HTML brief desc. - or rank by - split by collection 1.1.Results overview: CONTEXT Found 13,092 records in 0.56 seconds. Articles & Preprints, 12,433 records found guest :: session :: alerts :: baskets :: login Books & Proceedings, 337 records found Search Submit Convert Agenda Webcast Bulletin Library Presentations & Talks, 52 records found Periodicals & Progress Reports, 4 records found Home > Articles & Preprints > Preprints > Detailed record #981978 Multimedia & Outreach, 187 records found Archives, 79 records found Format: HTML | HTML MARC | XML DC | XML MARC Articles & 12,433 records found 1 - 10 jump to Preprints Other Fields of Physics physics/0609026 Preprints record: 1 On the Machian Origin of Inertia 1. Why Does Gravity Ignore the Vacuum Energy? / Padmanabhan, T 3 pages including front one The equations of motion for matter fields are invariant under the shift of the matter lagrangian by a constant. [...] gr-qc/0609012; 5 Sep 2006 Fulltext Berman, M S; Detailed record - Similar records 3 Sep 2006 . - 3 p 2. On the Machian Origin of Inertia / Berman, M S We determine the numerical value of a constant which appears in the Machian inertial force expression devised by Graneau and Graneau[2]. [...] physics/0609026; 3 Sep 2006 . - 3 p Fulltext Detailed record - Similar records Abstract: We determine the numerical value of 3. Large atom number Bose-Einstein condensate of sodium / a constant which appears in the Machian van der Stam, K M R; van Ooijen, E D; Meppelink, R; inertial force expression devised by Graneau Vogels, J M; Van der Straten, P and Graneau[2]. We point out that this formula We describe the setup to create a large Bose-Einstein condensate containing may be not restricted to Newtonian physics. more than 120x10^6 atoms. [...] Keywords: Einstein; Brans-Dicke; Newton; physics/0609028; 4 Sep 2006 . - 11 p Fulltext Gravitation; Graneau and Graneau; Mach. Detailed record - Similar records PACS: 01.55.+b; 04.20.-q. 4. Extended Coherence Time with Atom-Number Squeezed Sources / Li, W; Tuchman, A K; Chien, H C; Kasevich, M A Coherence properties of Bose-Einstein condensates offer the potential for improved interferometric phase contrast. [...] Access to fulltext document quant-ph/0609009; 2 Sep 2006 . - 4 p Fulltext Detailed record - Similar records 5. Some analytical models of radiating collapsing spheres / You can download this document in the following formats: Herrera, L; Prisco, A D; Ospino, J Portable Document Format - [51984 bytes] We present some analytical(a) solutions Search to the Einstein results equations, describing (b) Detailed record output radiating collapsing spheres in the diffusion approximation.