Master’s thesis

Venti analysis and memventi implementation

Designing a trace-based simulator and implementing a venti with in-memory index

Mechiel Lukkien
[email protected]
August 8, 2007

Committee:
prof. dr. Sape J. Mullender
ir. J. Scholten
ir. P.G. Jansen

Faculty of EEMCS
DIES, Distributed and Embedded Systems
University of Twente
Enschede, The Netherlands

Abstract

[The next page has a Dutch summary]

Venti is a write-once content-addressed archival storage system, storing its data on magnetic disks: each data block is addressed by its 20-byte SHA-1 hash (called score).

This project initially aimed to design and implement a trace-based simulator matching Venti behaviour closely enough to be used for determining good configuration parameters (such as cache sizes) and for testing new optimisations. A simplistic simulator has been implemented, but it does not model Venti behaviour accurately enough for its intended goal, nor is it polished enough for use. Modelled behaviour is inaccurate because the advanced optimisations of Venti have not been implemented in the simulator. However, implementation suggestions for these optimisations are presented.

In the process of designing the simulator, the Venti source code has been investigated, the optimisations have been documented, and disk and Venti performance have been measured. This allowed for recommendations about performance, even without a simulator. Besides magnetic disks, flash memory and the upcoming MEMS-based storage devices have also been investigated for use with Venti; they may be usable in the near future, but require explicit support.

The focus of this project has shifted towards designing and implementing memventi, an alternative implementation of the venti protocol.
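
To make the score-based addressing mentioned above concrete, the following is a minimal sketch of a write-once, content-addressed block store. It is an illustration only and not how Venti or memventi is implemented: the class name `ScoreStore` and its methods are hypothetical, and all blocks are kept in a Python dictionary rather than on disk.

```python
import hashlib


class ScoreStore:
    """Toy write-once, content-addressed block store (illustration only)."""

    def __init__(self):
        # Hypothetical in-memory table: score (20 bytes) -> block data.
        self._blocks = {}

    def write(self, data: bytes) -> bytes:
        """Store a block and return its score, the SHA-1 digest of the data."""
        score = hashlib.sha1(data).digest()
        # Write-once: a block that is already present is never overwritten.
        self._blocks.setdefault(score, data)
        return score

    def read(self, score: bytes) -> bytes:
        """Return the block stored under this score; raises KeyError if absent."""
        data = self._blocks[score]
        # The address doubles as an integrity check on the returned data.
        if hashlib.sha1(data).digest() != score:
            raise ValueError("stored block does not match its score")
        return data


store = ScoreStore()
score = store.write(b"hello venti")
assert len(score) == 20                      # SHA-1 digests are 20 bytes
assert store.write(b"hello venti") == score  # identical data, identical score
assert store.read(score) == b"hello venti"
```

Because the address is derived from the content, writing the same block twice yields the same score and stores the data only once, and a reader can verify a block by recomputing its hash.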