Memento: Time Travel for the Web Herbert Van de Sompel Michael L. Nelson Robert Sanderson Los Alamos National Old Dominion University, Los Alamos National Laboratory, NM, USA Norfolk, VA, USA Laboratory, NM, USA
[email protected] [email protected] [email protected] Lyudmila L. Balakireva Scott Ainsworth Harihar Shankar Los Alamos National Old Dominion University, Los Alamos National Laboratory, NM, USA Norfolk, VA, USA Laboratory, NM, USA
[email protected] [email protected] [email protected] ABSTRACT 1. INTRODUCTION The Web is ephemeral. Many resources have representa- \The web does not work," my eleven year old son com- tions that change over time, and many of those represen- plained. After checking power and network connection, I tations are lost forever. A lucky few manage to reappear realized he meant something rather more subtle. The URI as archived resources that carry their own URIs. For ex- (http://stupidfunhouse.com) he had bookmarked the year ample, some content management systems maintain version before returned a page that didn't look like the original at pages that reflect a frozen prior state of their changing re- all, and definitely was not fun. He had just discovered that sources. Archives recurrently crawl the web to obtain the the web has a terrible memory. actual representation of resources, and subsequently make Let us restate the obvious: the Web is the most pervasive those available via special-purpose archived resources. In information environment in the history of humanity; hun- both cases, the archival copies have URIs that are protocol- dreds of millions of people1 access billions of resources2 using wise disconnected from the URI of the resource of which a variety of wired or wireless devices.