Federal Department of Home Affairs FDHA Swiss Federal Office of Culture FOC Swiss NL

The Swiss National Library’s e-Helvetica Project1

Elena Balzardi Collections Department Head and e-Helvetica General Project Manager Swiss National Library

Electronic Information: Long-term Archiving

Just a few years ago, the idea that archival libraries should take over the long-term preservation of portions of the information published on the Internet would have elicited incredulous head-shaking or compassionate smiles. Almost always, the same arguments cropped up: either that the project would be impossible to carry out, or that it would not be of the slightest interest or relevance to future generations. After all, the very charm of the Internet lies in its transience. Such a project would, moreover, be technologically impossible.

Such doubts notwithstanding, for a couple of years now various noted institutions have been working on projects – and the applications these have spawned – in connection with the preservation of electronic information published on the Internet. The demands upon libraries with respect to electronic information can be classed into three categories:

1. Availability: Access to electronic information should be guaranteed at the least possible expense for each institution’s target and interest groups.
2. Archiving: Availability should also be guaranteed even when the information can no longer be accessed on the Internet either for a fee or free of cost.
3. Digitization: Information existing on analogue platforms should be made available to a broad public on the Internet by being equipped with the most user-friendly search options of the day.

Even today, the idea of preserving portions of information published on the Internet continues to inspire certain doubts. Basically, however, given the impressive spread of Internet usage and the greater awareness of the importance of preserving “memory-objects”, the meaningfulness of archiving Internet publications is no longer in doubt. What does arouse scepticism – and perhaps rightly so – is the question of feasibility.

The Swiss National Library’s Activities

The Swiss National Library is the main source worldwide for written material on Switzerland and the Swiss. Its mission is to “collect, index, preserve and disseminate any material connected with Switzerland, [whether] in printed form or stored on other information platforms”2 (Helvetica). Signs of a change in the information market had already appeared at the time the law was drawn up. The law of 1992 no longer stresses the type of platform but only the required relation to Switzerland. With the widespread rise of the Internet since 1993, the Swiss National Library began addressing the question of how, in the future, electronic publications could be stored on a long-term basis and thus preserved for subsequent consultation. It first set up various pilot projects, before starting the e-Helvetica project in 2001.

1 First published in Arbido “Elektronisches Publizieren – Informationsspezialisten als Mittler zwischen zwei Welten” (Electronic Publishing – Information Specialists as Intermediaries between Two Worlds), Issue 4 of 14 December 2006
2 Federal law on the Swiss National Library of 18 December 1992

The goal of the e-Helvetica project3 is to establish the foundations for the collecting, indexing, archiving and disseminating of electronic off- and online Helvetica. Offline Helvetica are media such as diskettes, CD-ROMs or DVDs. Online Helvetica are Internet publications. The Project Manual outlines the project’s structuring and defines its basic aims and time schedule. A cost estimate for the project and its subsequent operational phase is also being drawn up.

As a federal institution, the Swiss National Library is governed by federal law. Thus, for instance, it obtains its IT services from an internal federal service provider. Potential synergies with other federal institutions are being explored, with an eye to establishing an effective mode of collaboration. As a public institution, it must also comply with public procurement rules when procuring goods and services. And because it is Swiss, the Swiss National Library endeavours – especially within the framework of the e-Helvetica project – to collaborate above all with other Swiss libraries, on cantonal and university levels. On a national level, it has long-established contacts with the Swiss book and information market. Internationally, it works mainly with other national libraries entrusted with comparable mandates.

The e-Helvetica project is divided into two sub-sections. The sub-section “Organization” deals with the librarianship aspects of the e-Helvetica Collection – that is, determining the Collection’s contents, their indexing and their dissemination. The sub-section “Archiving” covers the e-Helvetica Collection’s IT aspects: the setting up of the technological foundations and the IT application for preparing and storing the Collection. The project is being developed by specialists in library science as well as IT experts. Seven project staff share a total of 3.1 full-time equivalents (FTE).

According to the project schedule as well as the project’s current status, the e-Helvetica project should be completed and operational by the end of 2008.

Contents: Pilot Projects

The sub-section Organization deals with the project’s librarianship aspects. The Swiss National Library has decided in favour of building up a selective collection of electronic publications. Offline media such as diskettes, CD-ROMs and DVDs are to be collected to the fullest possible extent. Internet publications (online media) are to be selectively collected in as broadly representative a range as possible. To date, no uniform international practice has been established with respect to the collection of electronic cultural artefacts by national libraries. Several national libraries use Web harvesting to collect as large a portion as possible of the Web sites published in their own countries, whereas other national libraries favour building up a selective collection. The Swiss National Library has chosen to develop a selective collection for legal reasons (copyright), economic reasons (the expansion and storage of the Collection), and resource reasons (the planning and carrying out of long-term preservation).

Within the framework of a selectively built up Collection, a broad foundation is foreseen. In order to establish the Collection and carry out the procedural steps connected with this undertaking, publications from a selection of publishers are to be collected. The implementation phases are being planned in three (as of 2007, four) pilot projects and are linked to the library tasks for the IT procedures.

3 See: www.e-helvetica.admin.ch

The pilot project “POP” (pilot project for transferring and archiving online commercial publications) collects commercial publications belonging to the longstanding Swiss publishers Karger (Basel)4 and Stämpfli ()5. The Swiss National Library thus continues the printed Collection that has been steadily expanded since the 1920s. In the main, the Collection comprises online periodical titles that are published in a specific format, and it is being built up in close collaboration and agreement with the publishers.

The “e-Diss.ch” pilot project focuses on collecting electronic doctoral and post-doctoral theses from Swiss universities. The Collection is being developed thanks to the coordinated efforts of the Swiss National Library and the university libraries, with the support of the “Conference of University Libraries of Switzerland” (KUB). The formats are generally in conformance with familiar standards. The authors are members of Swiss universities.

The pilot project “Web Archive Switzerland” deals with the collection of non-commercial Internet sites related to Switzerland’s geography, history and institutions. The Collection is being developed in close collaboration with the Swiss cantonal libraries, which are responsible for the selection of publications to be collected. Unlike the “POP” and “e-Diss.ch” projects, the pilot project “Web Archive Switzerland” features mixed formats.

A new pilot project for collecting official electronic documents published by the Swiss federal government will start in 2007.

All the electronic online Helvetica collected by the Swiss National Library are given a basic catalogue entry. Their storage and archiving will incorporate them into the Collection definitively.

Collaboration with other institutions is a mainstay of all the pilot projects. Our partners in this undertaking are various publishing houses, information producers, university libraries, cantonal libraries and administrations.

Technology: IT Projects

The archiving system is being set up according to the directives of the generic model “Open Archival Information System (OAIS)”.6 Adopted as ISO 14721, this reference model describes an archive as an organization where people and systems work together on the problem of preserving information and making it available to a defined user group. The model gives a detailed description of how Producer-issued electronic information should be integrated into an archival system, what preparatory steps must be undertaken for long-term archiving, and how information stored in the archive can be accessed.

4 www.karger.com
5 www.staempli.ch
6 http://public.ccsds.org/publications/archive/650x=b1.pdf


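[Figure: OAIS reference model, ISO 14721. Functional entities: Preservation Planning, Data Management, Ingest, Archival Storage, Access, Administration. The Producer delivers a SIP to Ingest; Ingest passes descriptive information to Data Management and the AIP to Archival Storage; the Consumer sends queries and orders to Access and receives result sets and DIPs.]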

Basically, the model determines how a Producer-issued object (SIP = Submission Information Package) is incorporated into the archiving system. Subsequent to its incorporation, it is transformed into an Archival Information Package (AIP) and filed in Archival Storage. The object is managed by the Data Management module. The archived object (AIP) is then delivered to the users, via the “Access” user module, in compliance with various legal restrictions, and in the form of a user object (DIP = Dissemination Information Package). Long-term preservation in the form of the migration and emulation of objects is planned in the Preservation Planning module. Administration of the archiving system as a whole is carried out by the Administration module.

The sequence of operations is basically the same as the classical sequence of operations for a library with printed holdings. However, in developing and preserving an electronic collection, the difficulty lies with the objects to be archived, which require several directly interdependent component parts in order to be readable (hardware, operating system, program, publication).
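The flow from SIP to AIP to DIP described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the Swiss National Library's implementation: the class names, the `obj-N` identifier scheme and the boolean rights check are all hypothetical, and Preservation Planning and Administration are left out.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission Information Package as delivered by a Producer."""
    producer: str
    title: str
    payload: bytes


@dataclass
class AIP:
    """Archival Information Package held in Archival Storage."""
    archive_id: str
    title: str
    payload: bytes


@dataclass
class DIP:
    """Dissemination Information Package handed to a user."""
    archive_id: str
    title: str
    payload: bytes


@dataclass
class Archive:
    storage: dict = field(default_factory=dict)    # Archival Storage: id -> AIP
    catalogue: dict = field(default_factory=dict)  # Data Management: id -> descriptive info

    def ingest(self, sip: SIP) -> str:
        """Ingest: turn a SIP into an AIP, store it, record descriptive info."""
        archive_id = f"obj-{len(self.storage) + 1}"  # hypothetical ID scheme
        self.storage[archive_id] = AIP(archive_id, sip.title, sip.payload)
        self.catalogue[archive_id] = {"title": sip.title, "producer": sip.producer}
        return archive_id

    def query(self, word: str) -> list:
        """Data Management: IDs of objects whose title contains the word."""
        return [i for i, info in self.catalogue.items()
                if word.lower() in info["title"].lower()]

    def access(self, archive_id: str, allowed: bool = True) -> DIP:
        """Access: derive a DIP from the stored AIP after a rights check."""
        if not allowed:
            raise PermissionError("access to this object is restricted")
        aip = self.storage[archive_id]
        return DIP(aip.archive_id, aip.title, aip.payload)
```

A caller would ingest a publication with `Archive().ingest(SIP(...))`, find it again with `query(...)`, and obtain the user copy with `access(...)`. Keeping the three package types separate mirrors the model's point that what is submitted, what is stored and what is delivered need not be identical.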

The Swiss National Library is building up its archiving system for electronic Helvetica based on the directives of the OAIS model. The “Archival Storage” module is being handled with an eye to creating synergies within the Federal Administration, in close collaboration with the Swiss Federal Archive. Other modules – for instance, the “Ingest” or “Access” modules – must be drawn up according to requirements and on the basis of the library collection’s objects and access possibilities. The “Ingest” module has just been put into operation. The electronic publications will – depending on the producer – be deposited in various ways in Archival Storage. An Access module for ensuring access is scheduled for 2007 and 2008.

The Swiss National Library’s metadata format is based on the standard developed by the Library of Congress: METS (Metadata Encoding & Transmission Standard, Version 1.3)7. MARCxml is used for the bibliographic metadata. For technical metadata, the Preservation Metadata of the National Library of New Zealand (July 2003 Version) applies.
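To make the combination of METS and MARCxml more concrete, the following Python sketch builds a very small METS document that wraps one MARCxml title field and points to one file. It is an assumption-laden illustration rather than the library's actual profile: the function name `build_mets`, the IDs `DMD1` and `FILE1` and the example URL are hypothetical, the technical metadata section is omitted, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MARC = "http://www.loc.gov/MARC21/slim"
XLINK = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS), ("marc", MARC), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def q(ns: str, tag: str) -> str:
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def build_mets(title: str, file_url: str) -> ET.Element:
    root = ET.Element(q(METS, "mets"))

    # Descriptive metadata: a MARCxml record wrapped inside METS.
    dmd = ET.SubElement(root, q(METS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), {"MDTYPE": "MARC"})
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    record = ET.SubElement(xml_data, q(MARC, "record"))
    datafield = ET.SubElement(record, q(MARC, "datafield"),
                              {"tag": "245", "ind1": "0", "ind2": "0"})
    ET.SubElement(datafield, q(MARC, "subfield"), {"code": "a"}).text = title

    # File section: where the archived file can be found.
    file_sec = ET.SubElement(root, q(METS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS, "fileGrp"))
    file_el = ET.SubElement(file_grp, q(METS, "file"), {"ID": "FILE1"})
    ET.SubElement(file_el, q(METS, "FLocat"),
                  {"LOCTYPE": "URL", q(XLINK, "href"): file_url})

    # Structural map: ties the file to the logical object.
    struct = ET.SubElement(root, q(METS, "structMap"))
    div = ET.SubElement(struct, q(METS, "div"))
    ET.SubElement(div, q(METS, "fptr"), {"FILEID": "FILE1"})
    return root


if __name__ == "__main__":
    doc = build_mets("Example thesis", "https://archive.example.org/thesis.pdf")
    print(ET.tostring(doc, encoding="unicode"))
```

The point of the wrapper is that bibliographic, technical and structural information travel together with the file reference in one package, which is what makes METS suitable as a container format for archival objects.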

A URN (Uniform Resource Name) is a persistent identifier – that is, a stable addressing system. Persistent identifiers can replace URLs (Uniform Resource Locator = Internet “link”) in the catalogue or other retrieval systems, or be applied to the document itself as stable references. This renders links permanent and reliable. The maintenance cost of updating references is lowered, since URLs need to be updated in only a few places. The links can be integrated into several retrieval systems. Digital publications have a worldwide defined identifier and can thus be reliably quoted8.

7 http://www.loc.gov/standards/mets

A URN ensures long-term access to an object. Long-lasting access is guaranteed by long-term archiving or object-based archiving as well as by the high technical availability of the URN service. A URN refers to at least one URL by which an object can be addressed. A URN can also manage several copies of the same object, that is, several URLs, as well as different presentation formats of the object.

The IT projects are also being carried out in close collaboration with other institutions. The “Archival Storage” module is being developed in co-operation with the Swiss Federal Archive, and is open to use by both. The remaining modules are being developed with external companies. Great advances have been made in the sharing of standards. Thus, for example, the has put its URN-Resolving-Service at the disposal of the Swiss National Library and its partner institutions engaged in the same or similar realms of activity. This will help avoid the development of different formats or standards.

What Lies Ahead?

For the time being, the projects and applications for the long-term preservation of major digital sources are still under construction. Due to all the different approaches and the diversity of collection contents, it is highly probable that at least part of the mass of relevant information will be preserved and made available for future generations.

As in the case of printed collections, long-lasting availability entails costs that, for now, can only be estimated. Memory institutions such as national or other archival libraries must concern themselves with the economic issue as well as the collection’s development.

In the same fashion as for the preservation of paper, various trends to do with the conservation of digital data will no doubt crop up over time. It is therefore important for the original objects to be preserved to the greatest possible extent, so that, depending on the technological progress of the day, suitable conservation measures can be adopted.

Above all, it is the task of libraries to make information accessible and to preserve it. Writing has been one of the most important elements of communication and understanding for centuries. It enables us to understand the past and to develop into the future. The form that writing takes, and the supports and technologies to which it is linked, have something to say about the society conveying the information. It must remain available for future generations.

Links, further details

Information about the e-Helvetica project and several other projects for the archiving of electronic publications:

The e-Helvetica Project, Swiss National Library (Switzerland) www.e-helvetica.admin.ch

Nestor – Network of expertise in long-term archiving () www.langzeitarchivierung.de

8 www.persistent-identifier.de/?link=520

Austrian National Library, Long-term archiving (Austria) www.onb.ac.at/about/lza/

PADI – Preserving Access to Digital Information (Australia) www.nla.gov.au/padi

Internet Archive (USA) www.archive.org/index.php

Digital Preservation, Library of Congress (USA) www.digitalpreservation.gov

Digital Preservation Coalition (Great Britain) www.dpconline.org/graphics

