Wayback: a User-Level Versioning File System for Linux

Brian Cornell, Peter Dinda, and Fabián Bustamante
Computer Science Department
Northwestern University

Really excited last night
● Great idea for this presentation
● 3d rotating slides...
● Woke up this morning, Oh no!
● Too excited: no backup copy
● No RCS [1] or CVS [2] check-in
● Lucky, Wayback was there!
● Reverted back to yesterday afternoon, and here we are, aren't you glad?
[1] F. Tichy, 1980.
[2] B. Berliner, 2001.

Outline
● Overview
● Related work
● How it works
  – User's perspective
● Behind the scenes
  – System's perspective
● Design decisions
● Performance
● Future work

Wayback: Automatic Versioning
● Automatic: Versions files and directories without user interaction
● Universal: Versions files of all types
● Comprehensive: Never loses data
● Generic: Works over any directory on any base FS
● User-space: Implemented in user-space
Free (GPL): http://wayback.sourceforge.net

Related Work
● VMS operating system [K McCoy, 1990]
  – Version on close
● Elephant [D Santry, et al, 1999]
  – Exploit massive disks
● VersionFS stackable file system [Muniswamy-Reddy, et al, 2004]
  – Contemporaneous to this project
  – Implements different features
● Cedar [D. K. Gifford, et al, 1998], 3DFS [D. G. Korn, et al, 1989], CVFS [C. A. Soules, et al, 2003]

How does it work?
● Three simple steps
● Remount folder
● Edit and work as normal
● Revert or extract old versions

Remount folder
● Run wayback executable to remount
  – wayback /mnt/data /home/brian/Projects
● Use mount.wayback script to simplify and mount for all users
  – mount.wayback /mnt/data /home/brian/Projects
● Files still at /mnt/data, but if you edit them at /home/brian/Projects, they are versioned

Work normally
● Edit files with any editor
● Modify binary files too
● Move, rename, delete files
● Edit directories as much as you want
● Everything you do is recorded and remembered

Revert or extract versions
● List the versions of a file
  – vstat Wayback.sxi
● Revert a file to an old version
  – To a time: vrevert -d 15:00:00 Wayback.sxi
  – To a date: vrevert -d 2004:06:29:15:00:00 Wayback.sxi
  – To a numbered version as listed by vstat: vrevert -n 5 Wayback.sxi
● Revert a directory to a date/time
  – vrevert -d 2004:06:29:15:00:00 Documents
● Note: Even reverting can be undone

Revert or extract versions
● Extract versions to a different file
  – Same format as vrevert with a destination
  – vextract -d 2004:06:29:15:00:00 Wayback.sxi GoodWayback.sxi
● Delete the versioning if you're sure you don't need it anymore
  – vrm Wayback.sxi
  – All versioning info for the file is permanently deleted
  – File is not deleted. To delete file and versioning, first delete file, then versioning.

Behind the Scenes
● FUSE [1] module sends all FS calls to Wayback
● Everything kept track of transparently
● Hidden log files, one per file or directory
● File operations recorded
● Directory operations recorded
[1] M. Szeredi, Filesystem in USEr space, http://sourceforge.net/projects/avf

Log Files
● Every versioned file has a hidden log file
  – Wayback.sxi. versionfs! version
● Every directory has a hidden catalog
  – Documents/. versionfs! version
● Logs created as soon as needed
● Logs omitted from directory listing by Wayback
● Logs in binary custom format

File Operations
● On every write or truncate, entry appended to log
● Data needed to undo write or truncate is written
Log file: Previous Log Entries ... | Timestamp | Write Offset | Overwritten Size | File Enlarged Flag | Overwritten Data

Directory Operations
● When files are changed in a way that changes their directory entries, directory catalog is updated
● Every operation writes timestamp and action ID
● Specific data per operation
Catalog file: Previous Catalog Entries ... | Timestamp | Action ID | Action Specific Data

Dir Ops: Create, Mkdir
● Just needs to know name so it can be deleted on revert
Action specific data: Length of Name | File or Directory Name

Dir Ops: Rm
● First, file is truncated to 0 bytes
  – File version log now has backup
● Record file attributes
  – Mode, owner, times
● Record filename
● Record destination for symbolic link
Action specific data: Size of Data | Attributes | Filename | [Link dest]
Attributes contents: Mode | UID | GID | Access Time | Modified Time | Create Time

Dir Ops: Rmdir, Rename
● Rmdir actually does rename and omits from directory listing
  – Don't want to lose logs for files in directory
● Record old and new names
Action specific data: Length of Names | Old and New Names

Dir Ops: Chmod, Chown, Utime
● Just need to know the filename and the old attributes
Action specific data: Size of Data | Attributes | Filename

Design Decisions
● User vs kernel level
  – User: Remount any base FS
  – User: Development easier, more stable
  – Kernel: Faster
● Undo vs redo log
  – Undo: Everything always retained as long as current version is not lost
  – Undo: No overhead of initial file copy
  – Redo: Files recoverable even if the current version is lost

Design Decisions
● FUSE vs other user level modules
  – It's modern and actively developed
  – It works with kernels 2.4 and 2.6
  – It gives complete but simple access to FS operations
  – It's written in C (with which we are more familiar)

Design Decisions
● Comprehensive versioning vs others
  – Record every operation that ever happens
  – Periodic: Choose time-granularity of versioning
  – Version on close: Assume one logical operation per file open
● Individual log per file vs central log
  – Individual: No central point of failure
  – Individual: Simple, logs stay with files
  – Central: Doesn't consume inodes as fast

What's the cost of Wayback?
● 3 performance tests
  – Bonnie [1]: Large file reads and writes
  – Andrew [2]: Typical use and compilation estimation
  – RCS: Versioning comparison with RCS
● Tested with ext3, FUSE, and Wayback
[1] T. Bray, http://www.textuality.com/bonnie
[2] J. Howard, et al, 1988.

Test machines
● Tests on 3 machines
  – Machine A: Normal
    ● Athlon XP 2400
    ● 512 MB memory
    ● 2.5 in notebook hard drive
  – Machine B: Slow disk
    ● Pentium IV 2.2 GHz
    ● 512 MB memory
    ● USB 1.1 hard drive
  – Machine C: Slow processor
    ● Celeron 500 MHz
    ● 128 MB memory
    ● 2.5 in notebook hard drive

Bonnie: 30-70% max overhead
[Figure: bar chart of throughput (MB/s) for ext3, FUSE, and Wayback. Phases: write per character, write per block, rewrite, read per character, read per block.]

Modified Andrew Benchmark
● Andrew uses Makefile to run five phases with Sun graphical source code
● Modification uses Perl and Linux wm
● Five phases:
  – 1: Directory creation (mkdir)
  – 2: Copying files (cp)
  – 3: Stating files (ls -l)
  – 4: Reading files (grep and wc)
  – 5: Compilation (gcc)

Andrew: Compile 15% slower
[Figure: bar chart of time (ms) for Ext3, FUSE, and Wayback. Phases: mkdir, cp, ls -l, grep and wc, gcc / 20.]

RCS Comparison
● 3 traces
  – Binary write:
    ● Random small changes with write system call
  – Binary file:
    ● Random small changes, but rewrite whole file
  – Text:
    ● Random changes to lines in text file
● 11 configurations
  – Wayback
  – RCS checkin after every k changes, k=1..10

RCS: Wayback Faster
[Figure: chart of time (s) for the Binary write, Binary file, and Text x 10 traces. Configurations: Wb, then RCS with k = 1 to 10.]

Future Work
● Log compression
  – Remove unnecessary granularity
● Garbage collection
  – Remove old versions
● Limiting
  – Limit number or size of versions
● Customizable versioning
  – Filter to only version some files, not others
● Migration of virtual machines
  – Undo/redo to sync virtual machines

Summary
● Don't lose important data!
● RCS, CVS, etc require user action
● Wayback automatically versions
  – Remount directory with versioning
  – Edit files as normal
  – Writes are written as undo information
  – Easy to revert or extract to any time
● Wayback faster than RCS even excluding user time

Get it! It's free! (GNU General Public License)
http://wayback.sourceforge.net

RCS Size Comparison (MB)
File Type        Binary Write    Binary File       Text
Wayback             1157456.4    106347428.0  2182218.0
RCS Period 1        2242325.4      3856521.8   101062.2
RCS Period 2        2237020.7      3779180.4    96134.2
RCS Period 3        2233854.1      3731427.8    94336.4
RCS Period 4        2234384.4      3719578.2    93597.1
RCS Period 5        2233716.4      3700853.4    93095.3
RCS Period 6        2227924.3      3621657.1    92375.5
RCS Period 7        2230300.3      3635107.2    92321.2
RCS Period 8        2227552.2      3590195.5    91960.0
RCS Period 9        2231124.4      3625548.8    92060.8
RCS Period 10       2232218.7      3629717.4    92045.2

Slow Disk: Bonnie Performance
[Figure: bar chart of throughput (MB/s) for ext3, FUSE, and Wayback. Phases: write per character, write per block, rewrite, read per character, read per block.]

Slow Processor: Bonnie Performance
[Figure: bar chart of throughput (MB/s) for ext3, FUSE, and Wayback. Phases: write per character, write per block, rewrite, read per character, read per block.]

Slow Disk: Andrew Performance
[Figure: bar chart of time (ms) for Ext3, FUSE, and Wayback. Phases: mkdir, cp, ls -l, grep and wc, gcc / 20.]

Slow Processor: Andrew Performance
[Figure: bar chart of time (ms) for Ext3, FUSE, and Wayback. Phases: mkdir, cp, ls -l, grep and wc, gcc / 20.]

Slow Disk: RCS Performance
[Figure: chart of time (s) for the Binary write, Binary file, and Text x 10 traces. Configurations: Wb, then RCS with k = 1 to 10.]

Slow Processor: RCS Performance
[Figure: chart of time (s) for the Binary write, Binary file, and Text x 10 traces. Configurations: Wb, then RCS with k = 1 to 10.]
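Sketch: undo log for file operations

The File Operations slide lists what each log entry holds: timestamp, write offset, overwritten size, file enlarged flag, and overwritten data. A minimal Python sketch of that idea follows. It is an illustrative in-memory model, not Wayback's code or its binary log format. The names UndoEntry, VersionedFile, content_at, and revert are hypothetical, timestamps are plain integers supplied by the caller, and the overwritten size is implied by the length of the saved bytes. A write that starts past the end of the file is treated here as a write of zero padding plus payload, which is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class UndoEntry:
    timestamp: int      # logical time of the operation (caller-supplied)
    offset: int         # where the write, or the truncate point, starts
    enlarged: bool      # True if the operation made the file longer
    overwritten: bytes  # bytes that were replaced or cut off


@dataclass
class VersionedFile:
    data: bytearray = field(default_factory=bytearray)
    log: list = field(default_factory=list)

    def write(self, now, offset, payload):
        size = len(self.data)
        if offset > size:
            # Assumption: model a write past EOF as zero padding plus payload.
            payload = bytes(offset - size) + payload
            offset = size
        end = offset + len(payload)
        # Save the bytes about to be replaced before touching the file.
        old = bytes(self.data[offset:end])
        self.log.append(UndoEntry(now, offset, end > size, old))
        self.data[offset:end] = payload

    def truncate(self, now, new_size):
        size = len(self.data)
        if new_size <= size:
            # Shrinking: the cut-off tail is the data needed to undo.
            tail = bytes(self.data[new_size:])
            self.log.append(UndoEntry(now, new_size, False, tail))
            del self.data[new_size:]
        else:
            # Growing: nothing is overwritten, only the old size matters.
            self.log.append(UndoEntry(now, size, True, b""))
            self.data.extend(bytes(new_size - size))

    def content_at(self, when):
        """Rebuild the contents as of time `when` by undoing newer entries."""
        snapshot = bytearray(self.data)
        for entry in reversed(self.log):
            if entry.timestamp <= when:
                break
            start = entry.offset
            stop = start + len(entry.overwritten)
            if entry.enlarged:
                del snapshot[stop:]          # shrink back to the old size
            snapshot[start:stop] = entry.overwritten
        return bytes(snapshot)

    def revert(self, now, when):
        """Revert by issuing logged operations, so the revert can be undone."""
        old = self.content_at(when)
        self.truncate(now, 0)
        self.write(now, 0, old)
```

For example, after writing b"hello world" at time 1, overwriting from offset 6 with b"there!" at time 2, and truncating to 5 bytes at time 3, content_at(1) walks the log backwards and rebuilds the first version. Because revert goes through the same logged write and truncate paths, a later content_at call can still recover the state from before the revert, in the spirit of the note that even reverting can be undone.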
