Preserving Large-Scale Cultural Heritage: the Case for Collaboration


Anne-Marie Schwirtlich, Director-General, National Library of Australia

It is a great honour and pleasure to be addressing you today and I pass on the greetings of Australian colleagues. Australia has been a keen observer and participant in CONSAL meetings since the very first conference in Singapore in 1970. How far we have all come since then!

Introduction

The large-scale loss of cultural heritage materials can happen for a number of reasons. It can happen when important cultural heritage content is lost to society because it was never collected and stored by the appropriate institution, whether library, archive, museum or gallery. It can happen when institutions themselves fail to take appropriate care of their collections, which become degraded and inaccessible. It can happen when even the most careful attention cannot prevent the natural degradation of physical materials caused by time itself, such as the crumbling pages of thread-bound books or brittle newsprint.

Cultural heritage is also lost when collections are damaged and destroyed by natural disasters, such as earthquake, tsunami or fire. [slides 2,3,4] These events are beyond our control, although much can be done to protect our buildings and collections from the risk of natural disaster.

Worst of all perhaps [slide 5], cultural heritage is lost at times of war and conflict due to looting and deliberate destruction, as we are seeing now in some parts of the world. Tragically, we are powerless to prevent this.

What can heritage collecting institutions do, then, to preserve their collections on a large scale? My presentation focuses on two strategies. Firstly, large-scale collecting to protect digital heritage materials that are outside library collections and in danger of loss, namely web-based publications and websites.
And, secondly, employing large-scale digitisation to preserve and make accessible fragile collection content, focusing on historical Australian newspapers. Overall, I want to emphasise the role of collaboration and cooperation in the successful achievement of these important tasks.

Setting the scene

At the National Library of Australia, we build and manage a set of rapidly growing and complex digital collections. [slide 6] In June 2015 these collections comprised about 3,500 terabytes of data. [slide 7] They include archived copies of Australian websites; digitised copies of historic Australian newspapers; digitised copies of oral history and other audio files; digitised copies of analogue collection items such as pictures, music scores, maps, and manuscripts; and a small collection of digital photographs and personal archives.

In addition to the digital collections we also build and manage physical collections of over 6 million volumes, housed in a heritage building in Canberra and two off-site storage repositories. Our staff numbers are around 410 Full Time Equivalents, and have been reducing each year in line with government budget requirements. It is therefore critical for us to make the best use of our scarce resources as we take on the ever-growing demands of both digital and physical collection management and preservation.

So why is collaboration such an important principle? To explain this, I’d like to share with you a comment made by the International Internet Preservation Consortium some years ago about the obstacles to preserving the web. [Slide 8] The Consortium noted that:

    The task is too large for individual institutions to undertake in isolation and the resources required for successful and sustained archiving are too great to make duplication of effort a tenable position.

We agree!

Models for collaboration

Thinking of digital collections, what types of collaboration are of the most use to us in the long term?
Broadly speaking, we see four groupings where collaboration most fruitfully occurs [Slide 9]:

  • Content custodians (such as national and research libraries, major archives, universities that maintain repositories of research outputs, and data archive centres) who are committed to long term preservation, including tackling the problem of obsolescence;
  • Communities of practice and information exchange (standards bodies, digital preservation experts, relevant professional associations);
  • Providers of services (such as infrastructure providers, software developers, registry services, identifier resolution services); and
  • Capacity building organisations (such as development organisations, funders of research, curriculum developers, non-government organisations).

Since we began to build our digital collections, we have found many opportunities to collaborate with institutions facing similar challenges, and with other stakeholders. Over the years these partners have included overseas national libraries, other Australian cultural institutions, university libraries, agencies developing relevant software, and standards bodies.

Our experience has shown that collaborations between different groupings can have interesting and useful outcomes, such as a partnership between the National Library and a university-funded research project to develop a standard or pilot a new service. These have often provided us with new information and learning which has deepened our knowledge or informed our practice.

Some collaborations within groupings, such as between the National Library and its state library counterparts in Australia, have had long-lasting strategic outcomes. The more successful of these have, in fact, been transformational, and I would like to look at two in detail.

Web archiving

The need to collect and retain a record of a society’s cultural expression on the web is now well understood. The medium is highly vulnerable, in respect to content at least.
Content can disappear without trace. There is no physical artefact or remnant for later, retrospective, collecting. As an example, the following sites archived by the National Library through a selective permissions-based approach have now disappeared from the live web:

  • [Slide 10] Most sites associated with the 2000 Sydney Olympic Games, including the official Sydney Organising Committee for the Olympic Games site.
  • [Slide 11] The website of John Howard, while he was our Prime Minister between 1998 and 2007
  • [Slide 12] A number of e-journals including:
      o Digital Technology Law Journal (1999-2004)
      o Asian Linguistics & Language Teaching (2002-2006)
      o The Zeitgeist Gazette
      o Journal of Sports Marketing (1998-2001)
  • [Slide 13] The APEC Australia 2007 website (2007)
  • [Slide 14] The Aboriginal and Torres Strait Islander Commission website (2006)
  • Treatynow.org (2002-2005)
  • Forgotten Australians (2009)
  • Most sites associated with the 1998 Australian Constitutional Convention and 1999 Australian Republic Referendum.
  • Most sites associated with the Centenary of Federation (2001)
  • Most campaign websites of Federal and State elections from 1996 onwards (e.g. jeff.com.au, Kevin07).
  • Online Australia, the Commonwealth Government’s first initiative to build online communities (1998-2000); also, GovOnline.gov.au (2002) and Culture.gov.au : Australia’s Culture Portal (2010).
  • The Australian Firearms Buyback website (2000)
  • The Paralysis Tick of Australia (2001-2002); and
  • The Jabiluka Uranium Mine Blockade website (1999)

Because the web is a publishing medium that more and more people can engage in, collecting web materials allows us to understand our society over time in ways that were not possible in the past, provided we collect and preserve the record.
More and more ‘grey literature’ (that is, documents that support and inform policy and research, produced by entities that are not in the commercial business of publishing) is published online only, because of the cheap and convenient means of publishing afforded by the web. Without web archiving we run the risk of allowing the proverbial ‘digital black hole’ to prevail in our cultural, social and intellectual memory. Never has there been cultural heritage on a larger scale in such urgent need of preservation. But how are we as library and collection institutions to deal with it?

The PANDORA Project

The need to cooperate with other collecting institutions to achieve effective results was very much the thinking behind the National Library’s development of the PANDORA web archive in 1996. [Slide 15] PANDORA is a selective web archive, prioritising content with a high research value while also ensuring we collect and preserve a broad sample of material representing the range of online culture and publication relating to Australia and Australians. One reason for taking a selective approach is that it lets us manage the negotiation of permissions, so that all the content collected for the PANDORA Archive can be made freely available to the public.

Because of the high cost of selective web archiving, it makes sense for a lead agency (such as a national library) to develop both the expertise and the infrastructure for web archiving, and for other agencies to leverage off this investment. Accordingly, PANDORA is a collaborative activity: the archive is built by the Australian state libraries and some other cultural institutions in addition to the National Library. This activity is an example of collaboration between content custodians. Today, 11 participants, including the National Library, jointly curate the PANDORA Archive.
[slide 16] This includes all state and territory libraries, with the exception of the Australian Capital Territory library service and the state library of Tasmania.