Ensuring Long-Term Access to Government Documents Through

Geoffrey Brown

Indiana University Department of Computer Science

Issues We Are Trying to Solve

Documents in FDP (for example) require obsolete applications and operating systems

Installing documents to access them requires specialized expertise

These problems generalize to SUDOC documents on CD-ROM

Virtualization Overview (VMware white paper)

Introduction

Among the leading business challenges confronting CIOs and IT managers today are cost-effective utilization of IT infrastructure, responsiveness in supporting new business initiatives, and flexibility in adapting to organizational changes. Driving an additional sense of urgency is the continued climate of IT budget constraints and more stringent regulatory requirements. Virtualization is a fundamental technological innovation that allows skilled IT managers to deploy creative solutions to such business challenges.

Virtualization in a Nutshell

Simply put, virtualization is an idea whose time has come. The term virtualization broadly describes the separation of a resource or request for a service from the underlying physical delivery of that service. With virtual memory, for example, computer software gains access to more memory than is physically installed, via the background swapping of data to disk storage. Similarly, virtualization techniques can be applied to other IT infrastructure layers including networks, storage, laptop or server hardware, operating systems and applications.

This blend of virtualization technologies, or virtual infrastructure, provides a layer of abstraction between computing, storage and networking hardware, and the applications running on it (see the figure below). The deployment of virtual infrastructure is non-disruptive, since the user experiences are largely unchanged. However, virtual infrastructure gives administrators the advantage of managing pooled resources across the enterprise, allowing IT managers to be more responsive to dynamic organizational needs and to better leverage infrastructure investments.

[Figure: Virtualization. Before: one Application on one Operating System on x86 Architecture (CPU, Memory, NIC, Disk). After: several Application and Operating System pairs on the VMware Virtualization Layer on x86 Architecture (CPU, Memory, NIC, Disk).]

Before Virtualization

Single OS image per machine

Software and hardware tightly coupled

Running multiple applications on same machine often creates conflict

Underutilized resources

Inflexible and costly infrastructure

After Virtualization

Hardware independence of operating system and applications

Virtual machines can be provisioned to any system

Can manage OS and application as a single unit by encapsulating them into virtual machines

Source: Virtualization Overview, Copyright VMware

Model Architecture

[Diagram: Document Repository (Document); Software Repository (Supporting Files, Software, OS); Web Server; Compute Server]

Actions in Response To Patron Request

A pre-configured emulator is allocated

Emulator is customized

Document file system mounted

Document specific installation executed

Shared file directories created for patron use

Link to emulator and web accessible file system provided through patron browser

Emulator executes remotely under patron control
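The sequence of actions above can be sketched as a small orchestration routine. This is only an illustrative outline, not the project's implementation: the `EmulatorSession` class, the `handle_patron_request` function, the URL layout, and the `example.org` host are all hypothetical names introduced here. Each step merely records what a real system would do.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EmulatorSession:
    """Hypothetical record of one patron's emulator session."""
    session_id: str
    document_id: str
    steps: List[str] = field(default_factory=list)
    emulator_url: str = ""
    files_url: str = ""


def handle_patron_request(session_id: str, document_id: str,
                          base_url: str = "https://example.org") -> EmulatorSession:
    """Walk through the seven actions in order, logging each one.

    No emulator is started here; every step only appends a description
    of the action a real system would perform.
    """
    s = EmulatorSession(session_id, document_id)
    s.steps.append("allocate pre-configured emulator")
    s.steps.append("customize emulator")
    s.steps.append(f"mount document file system for {document_id}")
    s.steps.append(f"run document-specific installation for {document_id}")
    s.steps.append("create shared file directories for patron")
    # Links handed to the patron's browser (the URL layout is an assumption).
    s.emulator_url = f"{base_url}/emulator/{session_id}"
    s.files_url = f"{base_url}/files/{session_id}"
    s.steps.append("provide links to emulator and shared files")
    s.steps.append("emulator runs remotely under patron control")
    return s


if __name__ == "__main__":
    session = handle_patron_request("s-001", "doc-42")
    for number, step in enumerate(session.steps, start=1):
        print(number, step)
    print(session.emulator_url)
    print(session.files_url)
```

Keeping the steps in one function makes the ordering explicit: the document file system has to be mounted before the document-specific installation can run, and the links are only handed out once the shared directories exist.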

Software Required for FDP

Windows 98 (most disks were for MS-DOS, Windows 3.1)

dBase III -- (dbfviewer2000)

WordPerfect

Lotus 1-2-3 (SmartSuite, SmartSuite viewer)

Microsoft Word (we use free MS Word viewer)

Various archive tools

Browser (we use Firefox)

Generic PostScript printer driver

Software we didn’t install -- Pascal, Fortran, SAS, ArcInfo
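One way to put a software list like this to use is a lookup from a document's file type to the viewer installed in the emulator image. The sketch below is a guess at how that could look. The viewer names come from the list above, but the file extensions, the `VIEWERS` table, and the `viewer_for` function are assumptions added for illustration.

```python
from pathlib import PurePath
from typing import Optional

# Viewer names come from the software list above; the file extensions are
# an assumption based on formats these applications commonly use.
VIEWERS = {
    ".dbf": "dbfviewer2000",
    ".wpd": "WordPerfect",
    ".wk1": "Lotus SmartSuite viewer",
    ".doc": "Microsoft Word viewer",
    ".htm": "Firefox",
    ".html": "Firefox",
}


def viewer_for(filename: str) -> Optional[str]:
    """Return the viewer for a file, or None if the type is unknown."""
    # Old CD-ROMs often use upper-case names, so compare in lower case.
    suffix = PurePath(filename).suffix.lower()
    return VIEWERS.get(suffix)


if __name__ == "__main__":
    for name in ["CENSUS.DBF", "report.doc", "README"]:
        print(name, "->", viewer_for(name))
```

Files with no recognized extension return `None`, which would flag them for manual analysis rather than guessing at an application.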

SUDOC Virtualization Project

Approximately 2500 SUDOC CD-ROMs in IU Library

We’ve built an online database of roughly 1800 CD-ROMs (about 1500 items)

Analyzed software requirements for about 1000

SVP

Philosophical Issues

Who Preserves the Emulator?

Why Not Just Migration?

[Figure: bar charts of code size. Axes: Million Lines of Code and Thousand Lines of Code. Categories: Office Applications, User Interfaces, Operating Systems. Labeled systems: KDE, Xpdf, Qemu, Plex86, KOffice, Mozilla, PearPC, DOSBox, Dosemu, Open Office, Windows 98, Windows XP, Windows 3.1, Xen.]

Why Not Just Migration

Loss of information -- e.g. Word edits

Loss of fidelity -- e.g. WordPerfect to Word isn’t very good

Loss of authenticity -- users of a migrated document need access to the original to verify authenticity

Not always possible -- closed proprietary formats

Not always feasible -- costs may be too high

Emulation may be necessary to enable migration

Goals

Deliver key SUDOC collections through virtualization

Develop web delivery techniques

Improve our software analysis tools

Develop image customization techniques (e.g. perform software install on the fly)

How This Work Might Be Used

Libraries share pool of software images and licenses

Libraries share expertise in supporting various document collections

Libraries collaborate to provide redundancy

Patrons access from anywhere without needing to obtain or install special software

Acknowledgments

Lou Malcomb (IU Head GIMSS)

Julianne Bobay (IU Head SLIS Library)