IBM Research - Haifa
Long Term Digital Preservation: An IT Perspective
Simona Rabinovici-Cohen IBM Research - Haifa [email protected] June 22, 2011
Presented at TRANSISTOR 2011: Preservation techniques and methodologies for digital audiovisual works, Crete, June 22-25, 2011
http://www.haifa.il.ibm.com/projects/storage/datastores/index.html © 2011 IBM Corporation
IBM Research – Over 3,000 Researchers Worldwide
Identity Card
° Haifa research lab – IBM’s largest Research facility outside the US – Employs ~500 researchers – Spans many IBM Research strategy areas
° Storage Systems research group – Our mission is to advance the state of the art of IBM’s storage systems and data management products – We conduct research in advanced storage functionalities – Very active in data preservation – Partner in the concluded CASPAR EU project – Leads the ENSURE EU project – Leads standardization efforts in SNIA
Agenda
° Background – The Challenge of LTDP – Preservation Approaches – The OAIS Standard
° Haifa Storage Tools for LTDP – Preservation DataStores (PDS) – LTDP Assessment Tool – SIRF Standardization – PDS Cloud
° Publications
The Long Term Digital Preservation (LTDP) Challenge
° These documents were created by pre-digital societies. The media and information content are still interpretable.
Dead Sea Scroll, ~70AD. Media: Copper. Language: Hebrew.
Mayan Glyph, Palenque ~630AD.
° This information was created a few years ago. – Will the media last for 20 years? – Will it be possible to access, interpret and present the data in 20 years? 50? 100?
The Incredible Growth of Digital Data
° IDC Report 3/2007: 6-fold growth in 4 years
– 2006 – 161 exabytes (10^18 bytes) of data was created
• 3 million times the information in all the books ever written
• 12 stacks of books from Earth to the Sun
• You could wrap the Earth 4 times
– 2010 – 988 exabytes
° IDC Report 5/2010: 44-fold growth in 11 years
– 2009 – 0.8 zettabytes (10^21 bytes)
– 2010 – 1.2 zettabytes
– 2020 – 35 zettabytes
– 1 ZB is a pile of DVDs over 250,000 km high
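The two growth figures can be compared by converting them to a constant yearly growth factor. The following is a minimal sketch of that arithmetic, using only the volumes quoted on this slide; the function name `annual_growth` is illustrative and not part of either IDC report.

```python
def annual_growth(start: float, end: float, years: int) -> float:
    """Return the constant yearly factor that grows `start` into `end`."""
    return (end / start) ** (1.0 / years)


# IDC 3/2007: 161 EB created in 2006, 988 EB forecast for 2010 (4 years).
rate_2007_report = annual_growth(161, 988, 4)

# IDC 5/2010: 0.8 ZB in 2009, 35 ZB forecast for 2020 (11 years).
rate_2010_report = annual_growth(0.8, 35, 11)

print(f"2007 report: about {100 * (rate_2007_report - 1):.0f}% growth per year")
print(f"2010 report: about {100 * (rate_2010_report - 1):.0f}% growth per year")
```

Under these inputs the earlier report implies growth of somewhat under 60% per year and the later one roughly 40% per year, so the longer forecast assumes a slower yearly rate even though its headline multiple is larger.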
Analog vs. Digital Preservation of Information
[Figure: chart comparing analog and digital preservation. One axis shows the time horizon (Decades, Centuries, Millennium), with difficulty rising from Hard to Very Hard; the legend distinguishes Digital from Analog. The individual category labels are not recoverable.]
The Digital Dark Age: Data Sustainability Paradox
° The world becomes more digital with richer interpretations and usages for the data – Growth in “born-digital” data: HDTV, digital cameras, healthcare devices, imaging – Conversion of formerly analog information to digital: Films, voice calls, TV signals ° But preservation of digital data is much more difficult than analog data
Paradox: As the world becomes digital, the world’s data is in greater danger of being lost! Our ability to store digital bits increases, but our ability to store them over time decreases!
What’s Driving LTDP? – The SNIA Survey
[Figure: bar chart, “Top External Factors Driving Long-Term Retention Requirements: Legal Risk, Compliance Regulations, Business Risk, Security Risk”. Factors shown, each tagged with a risk category: preservation of business history; protection of customer privacy; protection of business or intellectual assets; retaining history for competitiveness or protection; protection from compliance or legal fines; meeting regulatory requirements; concern with litigation protection. X-axis: percent of respondents, 0% to 60%.]
Source: SNIA 100 Year Archive Requirements Survey, January 2007.
What does Long-Term Mean?
° Retention of more than 20 years is required by about 70% of respondents.
[Figure: bar chart of required retention periods, x-axis 0% to 40%. >3-6 years: 1.9%; >7-10 years: 12.3%; >11-20 years: 15.7%; >21-50 years: 13.1%; the two longest ranges (>50-100 years and >100 years) account for 18.3% and 38.8%, with the bar-to-label assignment unclear.]
What’s Driving LTDP? – Data for Future Use
° Limone sul Garda yielded a new drug through gene preservation – 1000 residents – Many are long-lived (40 residents live to 100+) – No thickening of blood vessels, even when cholesterol is high
– Many residents carry the gene that produces the A1-Milano protein – A1-Milano quickly removes fat from the arteries and carries it to the liver – A new drug for cardiovascular diseases was discovered
° Can we do this with digital data? Should we preserve data for future unknown benefit?
Bit Preservation vs. Logical Preservation
° Bit preservation – ability to restore the bits in the presence of storage media degradation, storage media obsolescence, and environmental catastrophes such as fire or flooding • The life-span of disks: 3-5 years, tapes: 5-10 years, CDs and DVDs: 10-20 years – Products exist to some extent – copy services, refreshment, error-correcting code modules ° Logical preservation – preserving the understandability and usability of the data in the future – Current computer hardware and software technologies may no longer exist, and the users of the data may not be born yet – How does one ensure the provenance of the data? – How does one ensure only legitimate users can access the data? – Technology is still in the research phase
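The media life-spans quoted under bit preservation imply repeated refreshment over a long retention period. As a rough sketch only, the snippet below counts how many copies to fresh media would be needed over 100 years if a medium is never used past its life-span. The helper `migrations_needed` and the 100-year horizon are assumptions made for illustration; real refresh schedules depend on many more factors.

```python
import math

# Media life-spans in years as quoted on the slide (lower, upper bound).
LIFESPANS = {"disk": (3, 5), "tape": (5, 10), "CD/DVD": (10, 20)}


def migrations_needed(retention_years: int, lifespan_years: int) -> int:
    """Copies to fresh media so that no medium is used past its life-span."""
    return max(0, math.ceil(retention_years / lifespan_years) - 1)


for medium, (shortest, longest) in LIFESPANS.items():
    fewest = migrations_needed(100, longest)
    most = migrations_needed(100, shortest)
    print(f"{medium}: between {fewest} and {most} migrations over 100 years")
```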
Preservation Approach - Museum
° Museum Approach to LTDP – Original state of content and rendering devices preserved – Maintained and operational – Pros • No loss of information – Cons • Expensive • Time bounded • Not scalable • Warranty + spare parts
Preservation Approach - Emulation
° Emulation Approach to LTDP – Adapt rendering device by emulating it • Up-to-date software + computers – Pros • Reduces problem to preserving emulation platform • Cost proportional to number of rendering formats – Cons • Upfront investment • Only for data coupled with software • Does not allow new interpretations of the data
Preservation Approach - Migration
° Migration Approach to LTDP – Migrate to newer formats – Pros • Less investment when data ingested • Allows new uses of data – Cons • Can introduce noise • Cost proportional to data size • Continuous cost
Preservation Approach - Descriptive
° Descriptive Approach to LTDP – Add metadata to fully describe representation of data • Allows writing code in future to process format – Pros • No loss of information • Minimal assumptions on future • Delays cost until needed – Cons • Doesn’t support proprietary formats • May have future high cost
[Diagram labels: Capture; Data and Metadata; Storage; Render]
Preservation Approach - Encapsulation
° Encapsulation Approach to LTDP – Group together data and related metadata • Includes instructions to enable future interpretation – Pros • Most flexible • Consistent with everything but Museum approach • OAIS compliant – Cons • Doesn’t tell you what to do
Open Archival Information System (OAIS)
° ISO standard reference model (ISO 14721:2003)
° Provides fundamental ideas, concepts and a reference model for long-term archives
° Archival Information Package (AIP) – a logical structure for the preservation object that needs to be stored to enable future interpretation
° Content Data Object (CDO) – raw data to be preserved
[Figures: OAIS Functional Model; OAIS Information Model (AIP)]
OAIS AIP Logical Structure
[Diagram: an AIP comprises Content Information (a Content Data Object plus Representation Information) and Preservation Descriptive Information (Reference, Provenance, Context, Fixity, Access Rights); multiplicity labels 0-1 and 1-*.]
° Content Data Object – the raw data that is the focus of the preservation.
° Representation Information (RepInfo) – the information required to interpret the raw data for its designated community.
° Reference – globally unique and persistent identifiers for the content information.
° Provenance – the history and origin of the content information, any changes that may have taken place since it was originated, and who has had custody of it since then.
° Context – documents the reason for creation of the content information and its relationship to its environment.
° Fixity – a demonstration that the particular content information has not been altered in an undocumented manner.
° Access Rights – the information that identifies the access restrictions pertaining to the Content Information, including the legal framework, licensing terms, and access control.
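The AIP components listed above can be pictured as a nested record. The following is only an illustrative sketch: the class and field names are assumptions chosen to mirror the terms on this slide, not a schema defined by OAIS or by PDS, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentInformation:
    content_data_object: bytes        # the raw data being preserved
    representation_info: List[str]    # one or more RepInfo records


@dataclass
class PreservationDescriptiveInformation:
    reference: str                    # globally unique, persistent identifier
    provenance: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    fixity: List[str] = field(default_factory=list)
    access_rights: List[str] = field(default_factory=list)


@dataclass
class ArchivalInformationPackage:
    content: ContentInformation
    pdi: PreservationDescriptiveInformation


# Hypothetical example values, for illustration only.
aip = ArchivalInformationPackage(
    content=ContentInformation(b"raw bytes", ["description of the data format"]),
    pdi=PreservationDescriptiveInformation(reference="urn:example:aip:1"),
)
aip.pdi.provenance.append("created by the example ingest step")
print(aip.pdi.reference, len(aip.content.representation_info))
```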
Preservation in a Nutshell with OAIS
[Diagram: preservation flow under OAIS, with a Logical Preservation layer above a Bit Preservation layer]
° Object to Preserve and Metadata → Submission Information Package (SIP)
° Create Archival Information Package (AIP), e.g. AIP for a Word 2003 SP3 document
° Logical Transformations, e.g. doc to PDF/A transformation
° Emulators, e.g. VM image with Office 2003 SP3
° Information needed to make AIP usable, e.g. (1) spec of format, (2) text summary
° Extract AIP, “tools” and Descriptive MetaData → Dissemination Information Package (DIP) → Usable Preserved Object
What do we have in Haifa?
° Infrastructure: Preservation DataStores (PDS) – CASPAR: Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval • PDS provides the storage infrastructure of CASPAR EU project – Archiving: Long term retention capabilities to existing systems • Demo of preservation support to enterprise content management (ECM) and archiving systems – ENSURE: Enabling kNowledge, Sustainability, Usability and Recovery for Economic Value • Examining use of cloud for preservation infrastructure in ENSURE EU project
° Assessment: Long Term Digital Preservation Assessment (LTDPA) – Research tool to evaluate organization’s ability to preserve its digital resources. Based upon emerging standard audit checklists (ISO 14721)
° Standards: Storage Networking Industry Association (SNIA) – IBM co-chairs the Long Term Retention technical working group – LTR develops a Self-contained Information Retention Format (SIRF)
Preservation Aware Storage: A New Storage Paradigm
Preservation Aware Storage: the storage component of a digital preservation system that has built-in support for preservation.
° While traditional storage supports bit preservation at most, preservation aware storage supports logical preservation as well. ° Preservation Aware Storage supports offloading functionality to the storage layer – Decrease the probability of data loss – Simplify the applications – Provide improved performance and robustness
Preservation DataStores (PDS) Overview
° OAIS–based preservation-aware storage that supports LTDP ° Offload OAIS-based functionality to: – Decrease probability of data loss – Simplify the applications – Provide improved performance and robustness ° Manage preservation metadata ° Supports automation of preservation processes
° PDS is the storage infrastructure of EU project CASPAR ° PDS is available at alphaWorks - http://www.alphaworks.ibm.com/tech/pds
PDS Main Functionality
– AIP generation - generation of preservation metadata and creation of AIPs with various packaging formats • Metadata enrichment - automatic extraction of metadata from the submitted content data and addition of RepInfo and Preservation Descriptive Information (PDI)
– Data transformations - provide the ability to load transformation modules (storlets), apply them on AIPs and generate new AIP versions • Storlets are restricted modules with predefined interfaces used to execute data intensive functions, e.g., transformations, fixity calculation
– Fixity management – flexible periodic fixity (integrity) checks where multiple loadable fixity modules can be used and the fixity values are stored in a standard PREMIS (v2) format
– RepInfo management - allows sharing, search and categorization of RepInfos
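The fixity management item above describes periodic integrity checks with pluggable algorithms. A minimal sketch of the idea follows, using Python's standard `hashlib`. The record layout (`algorithm`, `digest`) is a simplification assumed for illustration and is not the PREMIS schema that PDS uses; the function names are likewise hypothetical.

```python
import hashlib


def compute_fixity(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of `data` under the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def check_fixity(data: bytes, record: dict) -> bool:
    """Recompute the digest and compare it with the stored fixity record."""
    return compute_fixity(data, record["algorithm"]) == record["digest"]


content = b"example content data object"

# At ingest time: store one record per algorithm in use.
records = [
    {"algorithm": name, "digest": compute_fixity(content, name)}
    for name in ("sha256", "md5")
]

# At a later periodic check: every stored record must still match.
intact = all(check_fixity(content, r) for r in records)
tampered = all(check_fixity(content + b"!", r) for r in records)
print(intact, tampered)
```

Keeping the algorithm name in each record is one way to allow several fixity modules side by side, since a check only needs the record to know which digest to recompute.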
CASPAR and PDS
• PDS was deployed in ESA for GOME data
• PDS was deployed in ASemantics
• PDS is available at AlphaWorks
AIP Generation Example: Preserving an Audiovisual Object
° Object – A Windows Media Player video clip ° “Kia playing violin for gesture analysis on 2007-02-16 at 12:42:17 in ICSRiM - University of Leeds” – This is part of i-Maestro project demonstration and shows violin bowing visualization. – This is a synchronized version of video, sound and 3d motion for analysis of violin bowing.
AIP Contents
Content Information: Content Data Object, Representation Information (Rep Info)
Preservation Descriptive Information: Reference, Provenance, Context, Fixity, Access Rights
Content Data Object
Video clip in Windows Media Video format
Representation Information
Record 1: Windows Media Player Homepage (URL)
Record 2: Notes sheet (PDF)
Provenance
Record 1: Pre-ingest
Context
Record 1 http://www.i-maestro.org/
Record 2 http://www.icsrim.org.uk/
Fixity
Record 1: External
Record 2: Internal
Transformation with Storlets
[Diagrams: three steps, each issued through PDS Web Services to the PDS Server]
1. Ingest original AIP: the original content, original RepInfo and original PDI (Preservation Descriptive Information) form the Original AIP.
2. Load transformation: the transformation module (content), RepInfo for transformed content, PDI of transformation and RepInfo of transformation form the Transformation AIP, stored alongside the Original AIP.
3. Invoke transform AIP: applying the Transformation AIP to the Original AIP yields a New AIP holding the transformed content, RepInfo for the transformed content, and PDI generated by PDS.
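The three storlet steps can be sketched in a few lines. This is a toy model under stated assumptions: the `Aip` and `Storlet` classes, their fields, and `invoke_transform` are hypothetical names invented here, not the PDS web-service interface. The point it illustrates is that a transformation leaves the original AIP untouched and produces a new version with its own RepInfo and an added provenance entry.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Aip:
    version: int
    content: bytes
    rep_info: str
    provenance: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Storlet:
    name: str
    transform: Callable[[bytes], bytes]   # the loaded transformation module
    output_rep_info: str                  # RepInfo for the transformed content


def invoke_transform(original: Aip, storlet: Storlet) -> Aip:
    """Apply the storlet and return a new AIP version; the original is kept."""
    event = f"transformed by {storlet.name} from version {original.version}"
    return Aip(
        version=original.version + 1,
        content=storlet.transform(original.content),
        rep_info=storlet.output_rep_info,
        provenance=original.provenance + (event,),
    )


# Step 1: ingest. Step 2: load a transformation. Step 3: invoke it.
original_aip = Aip(version=1, content=b"hello", rep_info="lower-case ASCII text")
to_upper = Storlet("to-upper", bytes.upper, "upper-case ASCII text")
new_aip = invoke_transform(original_aip, to_upper)
print(new_aip.version, new_aip.content, new_aip.provenance)
```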
The LTDPA tool helps in assessing the capabilities of an archival organization to deliver Long Term Digital Preservation services
Based on the OCLC/RLG audit checklist, the tool helps in evaluating the compliance level of Organization, Processes and Technology with the OAIS reference model and best practices.
The Tool holds:
° Knowledge of expert community & ISO Best Practices
° Assessment checklist
° Evaluation metrics
The Tool enables:
° Web-based data collection
° Quantitative analysis
° Report generation
° Common repository buildup & usage
The Tool can be used for:
° Identifying gaps between Current State vs. Best Practice or vs. Desired State
° Comparative analysis and industry benchmarks
° Knowledge transfer
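The gap identification mentioned above amounts to comparing two sets of scores. A minimal sketch follows; the three areas come from the slide text (Organization, Processes, Technology), while the numeric maturity scores, the scale, and the function name `maturity_gaps` are invented purely for illustration and do not reflect the LTDPA tool's actual metrics.

```python
from typing import Dict


def maturity_gaps(current: Dict[str, int], desired: Dict[str, int]) -> Dict[str, int]:
    """Return desired-minus-current maturity for every assessed area."""
    return {area: desired[area] - current[area] for area in desired}


# Hypothetical scores on an assumed 1-5 maturity scale.
current_state = {"Organization": 2, "Processes": 3, "Technology": 1}
desired_state = {"Organization": 4, "Processes": 4, "Technology": 3}

gaps = maturity_gaps(current_state, desired_state)
widest = max(gaps.values())
priority_areas = sorted(area for area, gap in gaps.items() if gap == widest)
print(gaps, priority_areas)
```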
LTDPA Tool Workflow
1 Gather Data: Within an LTDPA engagement, an engagement definition is set up, indicating individual client respondents. The respondents can then log in using the LTDPA web survey tool, fill out surveys and save their responses back to the LTDPA Repository.
2 Analyze & Report: With all of the client responses stored in the database, the engagement leader can load the collective set of client results, view basic statistical results, analyze the data, and export diagrams and data to MS Office formats for deliverable creation.
3 Historical & Benchmarking Analysis: Since all the data is stored in the single repository, the results from a given engagement can be used again, either as a time sequence when the assessment is performed again at that same client, or as part of a benchmarking exercise, etc. (Year over Year, Group vs Group, Client vs Industry, etc.)
Based on CAT Overview presentation, Matt Callery, IBM Research, Fall 2006
“Client X” LTDP Assessment Summary – Current-State and Desired-State Maturity Levels
[Figure: maturity-level summary chart with index; content not recoverable]
Self-contained Information Retention Format (SIRF)
° Being developed by Storage Networking Industry Association (SNIA), Long Term Retention (LTR), Technical Working Group (TWG) – Co-chaired by IBM and Symantec
° SIRF is a logical container format appropriate for long-term storage of digital information – Preserves collections of objects and their relationships – Includes generic metadata that can be extended with domain specific information for fast access – Can be mapped to and physically migrated between a wide variety of underlying storage systems
° SIRF use cases and requirements document is released for public review – http://www.snia.org/tech_activities/publicreview
An Analogy
° Standard archival box (photo courtesy Oregon State Archives)
– Archivists gather together a group of related items, known as a collection
– The collection is placed in a physical box container
– The box is labeled with information about its content, e.g., name and reference number, date, contents description, destroy date
• And there’s an online (XML) finding aid
– When contents are migrated, they’re added to the box
° SIRF is the digital equivalent
– Logical container for a set of (digital) preservation objects and a catalog
– The SIRF catalog contains metadata related to the entire contents of the container as well as to the individual objects
– SIRF standardizes the information in the catalog
SIRF Components
A SIRF container includes: ° A magic object: identifies SIRF container and its version ° Numerous preservation objects that are immutable ° A catalog that is – Updatable – Contains metadata to make container and preservation objects portable into the future without external functions
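The three SIRF components can be mimicked with a small in-memory class. This is a sketch under loud assumptions: the class `SirfContainer`, its method names, and the field names in the magic object and catalog are all hypothetical, since the slide does not give the SIRF serialization. It only demonstrates the stated properties: a magic object identifying the container and version, preservation objects that cannot be overwritten, and a catalog that can be updated.

```python
class SirfContainer:
    """Toy model of a SIRF-like container (not the SNIA format itself)."""

    def __init__(self, version: str) -> None:
        # Magic object: identifies the container type and its version.
        self.magic = {"format": "SIRF", "version": version}
        self._objects = {}                                # immutable once added
        self.catalog = {"container": {}, "objects": {}}   # updatable metadata

    def add_object(self, object_id: str, data: bytes, metadata: dict) -> None:
        if object_id in self._objects:
            raise ValueError("preservation objects are immutable")
        self._objects[object_id] = bytes(data)
        self.catalog["objects"][object_id] = dict(metadata)

    def get_object(self, object_id: str) -> bytes:
        return self._objects[object_id]

    def update_catalog(self, object_id: str, **changes) -> None:
        self.catalog["objects"][object_id].update(changes)


container = SirfContainer(version="0.1")
container.add_object("obj-1", b"preserved bytes", {"format": "text/plain"})
container.update_catalog("obj-1", last_fixity_check="2011-06-22")
print(container.magic, container.catalog["objects"]["obj-1"])
```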
Cloud Computing: Hottest Topic in the Industry…
Cloud Technologies
° Usage – Amazon S3 stores over 260 billion objects today, with 1 trillion expected by 2012 – 15% of all digital data will be in the cloud by 2020 (IDC), with another 20% touching the cloud ° Why is it so appealing for preservation? – Cost, availability, scalability
Cloud Computing: What’s Driving it?
1. Cost Reduction
– Cloud: Highly virtualized with many users sharing the same hardware
2. Technology Maturity Cycle
– New: Wow, it works!
– Business: Focus higher in the solution stack
– Cloud: Companies who are moving to the cloud are focusing on their business, not technology
3. Payment model: Pay per use to reduce bar of adoption
– Cloud: Pay per use with immediate provisioning
Preservation in the Cloud: Is it an option? Or can it be made into an option?
° Relevance to LTDP – Can YouTube become the long-term digital preservation solution for videos? – Can Google Docs become the long-term digital preservation solution for documents? – Etc. ° There’s a lot missing today – Security and SLA guarantees – Access and performance model (% downloads vs % uploads) – Auditability, compliance and regulatory requirements – Long-term trust in the cloud provider – Preservation layer
ENSURE: Enabling kNowledge Sustainability, Usability and Recovery for Economic value
° ENSURE is a recently started (Feb ’11) FP7, Call 6 EU Project, coordinated by IBM in the area of digital preservation
° There is a need to take a more business/industry-oriented focus
° ENSURE addresses this need by focusing on healthcare and life sciences (HCLS) and finance use cases
° In addressing this need, ENSURE’s specific objectives are driven by the needs of businesses and regulatory compliance
Overview of ENSURE’s area for innovation
° Ability to compose different quality solutions at different costs
° Apply cutting-edge ICT to digital preservation solutions
° Preservation Lifecycle Management of environmental changes, evolution of ontologies, quality of digital objects, etc. over time
° Content-aware long-term data protection
Benefits and issues in using a cloud model for digital preservation
° Cloud Security Requirements
– Support for object provenance, certification, auditing, …
– Trust over time (SLAs, events)
– Changes over time
° Cloud Technology Requirements
– Multi-cloud support (export, replication)
– Programmatic visibility
– Computation near data
– Integration with lifecycle management
° The Benefits of Clouds
• Scalable in number of objects and size of data
• Pay-as-you-go
• Sharing across geographic domains – available anywhere
PDS Cloud: Extending PDS to the Cloud
° Map OAIS AIP and the links among AIPs to the cloud data model ° Multi cloud support while considering self-containment and self-describing implications ° Study the use of computational cloud storage for preservation (PDS Storlets) ° Support flexible integrity (fixity) checks and auditing capabilities ° Map preservation policies to the cloud including use of offline media
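One way to picture the first goal above, mapping an AIP and its links onto a cloud object model, is to flatten each AIP into a handful of keyed objects under a common prefix. The sketch below is speculative and only illustrative: the key layout, the manifest fields, and the function `aip_to_cloud_objects` are assumptions made here, not the PDS Cloud design, and a plain dictionary stands in for a cloud bucket.

```python
import json
from typing import Dict, List


def aip_to_cloud_objects(
    aip_id: str, content: bytes, rep_info: dict, pdi: dict, links: List[str]
) -> Dict[str, bytes]:
    """Flatten one AIP into key/value objects suitable for an object store."""
    prefix = f"aip/{aip_id}/"
    manifest = {
        "aip_id": aip_id,
        "parts": ["content", "repinfo.json", "pdi.json"],
        "links": sorted(links),   # identifiers of related AIPs
    }
    return {
        prefix + "content": content,
        prefix + "repinfo.json": json.dumps(rep_info).encode("utf-8"),
        prefix + "pdi.json": json.dumps(pdi).encode("utf-8"),
        prefix + "manifest.json": json.dumps(manifest).encode("utf-8"),
    }


bucket: Dict[str, bytes] = {}   # stand-in for a cloud bucket
bucket.update(
    aip_to_cloud_objects(
        "example-2",
        b"transformed bytes",
        {"format": "example format description"},
        {"reference": "urn:example:aip:2"},
        links=["example-1"],
    )
)
manifest = json.loads(bucket["aip/example-2/manifest.json"])
print(sorted(bucket), manifest["links"])
```

Storing the manifest next to the parts keeps each AIP self-describing, so the set of objects could in principle be copied to another provider without consulting an external index.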
Publications
° “Towards SIRF: Self-contained Information Retention Format” – 4th Annual International Systems and Storage Conference (SYSTOR), May 30-June 1, 2011
° “Preservation DataStores” chapter in “Advanced Digital Preservation” book – www.springer.com/978-3-642-16808-6
° “Using XFDU for CASPAR Information Packaging” – OCLC Systems & Services: International Digital Library Perspectives, Vol. 26 No. 2, 2010
° “Authenticity and Provenance in Long Term Digital Preservation: Modeling and Implementation in Preservation Aware Storage” – USENIX First Workshop on the Theory and Practice of Provenance (TaPP), February 23, 2009, San Francisco
° “Preservation DataStores: New Storage Paradigm for Preservation Environments” – IBM Journal of Research and Development on Storage Technologies and Systems, Volume 52, Number 4/5, 2008
° “Preservation DataStores: Architecture for Preservation Aware Storage” – IEEE Conference on Mass Storage Systems and Technologies (MSST), September 2007, San Diego, USA
° “The Need for Preservation Aware Storage - A Position Paper” – ACM SIGOPS Operating Systems Review, Special Issue on File and Storage Systems, Volume 41, Issue 1 (Jan 2007), pp. 19-23
° “Towards OAIS-Based Preservation Aware Storage - A White Paper” – http://www.haifa.il.ibm.com/projects/storage/datastores/public.html
Many Thanks … to the IBM Haifa Team !!
Shimon Agassi Ealan Henis Aner Hamama Orit Edelstein Michael Factor John Marberg Kenneth Nagin Dalit Naor Leeat Ramati Petra Reshef Shahar Ronen Eliot Salant