The Automatic Improvement of Locality in Storage Systems
Total Page:16
File Type:pdf, Size:1020Kb
The Automatic Improvement of Locality in Storage Systems Windsor W. Hsuyz Alan Jay Smithz Honesty C. Youngy yStorage Systems Department Almaden Research Center IBM Research Division San Jose, CA 95120 fwindsor,[email protected] zComputer Science Division EECS Department University of California Berkeley, CA 94720 fwindsorh,[email protected] Report No. UCB/CSD-03-1264 July 2003 Computer Science Division (EECS) University of California Berkeley, California 94720 The Automatic Improvement of Locality in Storage Systems Windsor W. Hsuyz Alan Jay Smithz Honesty C. Youngy yStorage Systems Department zComputer Science Division Almaden Research Center EECS Department IBM Research Division University of California San Jose, CA 95120 Berkeley, CA 94720 fwindsor,[email protected] fwindsorh,[email protected] Abstract 80 Random I/O (4KB Blocks) ) s Sequential I/O Disk I/O is increasingly the performance bottleneck in r u o 60 H computer systems despite rapidly increasing disk data ( k s transfer rates. In this paper, we propose Automatic i D e r Locality-Improving Storage (ALIS), an introspective i t n 40 E storage system that automatically reorganizes selected d a disk blocks based on the dynamic reference stream to e R o t increase effective storage performance. ALIS is based e 20 m on the observations that sequential data fetch is far more i T efficient than random access, that improving seek dis- tances produces only marginal performance improve- 0 ments, and that the increasingly powerful processors and 1991 1995 1999 2003 large memories in storage systems have ample capacity Year Disk was Introduced to reorganize the data layout and redirect the accesses so as to take advantage of rapid sequential data trans- Figure 1: Time Needed to Read an Entire Disk as a fer. Using trace-driven simulation with a large set of real Function of the Year the Disk was Introduced. workloads, we demonstrate that ALIS considerably out- performs prior techniques, improving the average read between the processor and the disk continues to widen, performance by up to 50% for server workloads and by disk-based storage systems are increasingly the bottle- about 15% for personal computer workloads. We also neck in computer systems, even in personal computers show that the performance improvement persists as disk (PCs) where I/O delays have been found to highly frus- technology evolves. Since disk performance in practice trate users [25]. To make matters worse, disk recording is increasing by only about 8% per year [18], the benefit density has recently been rising by more than 50% per of ALIS may correspond to as much as several years of year [18], far exceeding the rate of decrease in access technological progress. density (I/Os per second per gigabyte of data), which has been estimated to be only about 10% per year in 1 Introduction mainframe environments [35]. The result is that al- though the disk arm is only slightly faster in each new Processor performance has been increasing by more generation of the disk, each arm is responsible for serv- than 50% per year [16] while disk access time, being ing a lot more data. For example, Figure 1 shows that limited by mechanical delays, has improved by only the time to read an entire disk using random I/O has in- about 8% per year [14, 18]. As the performance gap creased from just over an hour for a 1992 disk to almost Funding for this research has been provided by the State 80 hours for a disk introduced in 2002. of California under the MICRO program, and by AT&T Lab- Although the disk access time has been relatively sta- oratories, Cisco Corporation, Fujitsu Microelectronics, IBM, ble, disk transfer rates have risen by as much as 40% Intel Corporation, Maxtor Corporation, Microsoft Corpora- per year due to the increase in rotational speed and lin- tion, Sun Microsystems, Toshiba Corporation and Veritas ear density [14, 18]. Given the technology and industry Software Corporation. trends, such improvement in the transfer rate is likely 1 to continue, as is the almost annual doubling in storage lowed in Section 4 by a discussion of the methodology capacity. Therefore, a promising approach to increasing used to evaluate the effectiveness of ALIS. Details of effective disk performance is to replicate and reorganize some of the algorithms are presented in Section 5 and selected disk blocks so that the physical layout mirrors are followed in Section 6 by the results of our perfor- the logically sequential access. As more computing re- mance analysis. We present our conclusions in Section 7 sources become available or can be added relatively eas- and discuss some possible extensions of this work in ily to the storage system [19], sophisticated techniques Section 8. To keep the chapter focused, we highlight that accomplish this transparently, without human in- only portions of our results in the main text. More de- tervention, are increasingly possible. In this paper, we tailed graphs and data are presented in the Appendix. propose an autonomic [23] storage system that adapts itself to a given workload by automatically reorganiz- 2 Background and Related Work ing selected disk blocks to improve the spatial locality of reference. We refer to such a system as Automatic Various heuristics have been used to lay out data Locality-Improving Storage (ALIS). on disk so that items (e.g., files) that are expected to The ALIS approach is in contrast to simply clus- be used together are located close to one another (e.g., tering related data items close together to reduce the [10, 34, 36, 40]. The shortcoming of these a priori tech- seek distance (e.g., [1, 3, 43]); such clustering is not niques is that they are based on static information such very effective at improving performance since it does as the name space relationships of files, which may not not lessen the rotational latency, which constitutes about reflect the actual reference behavior. Furthermore, files 40% of the read response time [18]. Moreover, because become fragmented over time. The blocks belonging of inertia and head settling time, there is only a rela- to individual files can be gathered and laid out contigu- tively small time difference between a short seek and a ously in a process known as defragmentation [8, 33]. long seek, especially with newer disks. Therefore, for But defragmentation does not handle inter-file access ALIS, reducing the seek distance is only a secondary patterns and its effectiveness is limited by the file size effect. Instead, ALIS focuses on reducing the number which tends to be small [2, 42]. Moreover, defragmen- of physical I/Os by transforming the request stream to tation assumes that blocks belonging to the same file exhibit more sequentiality, an effect that is not likely to tend to be accessed together which may not be true for diminish over time with disk technology trends. large files [42] or database tables, and during application ALIS currently optimizes disk block layout based on launch when many seeks remain even after defragmen- the observation that only a portion of the stored data tation [25]. is in active use [17] and that workloads tend to have The posteriori approach utilizes information about long repeated sequences of reads. ALIS exploits the for- the dynamic reference behavior to arrange items. An ex- mer by clustering frequently accessed blocks together ample is to identify data items – blocks [1, 3, 43], cylin- while largely preserving the original block sequence, ders [53], or files [48, 49, 54] – that are referenced fre- unlike previous techniques (e.g., [1, 3, 43]) which fail quently and to relocate them to be close together. Rear- to recognize that spatial locality exists and end up ren- ranging small pieces of data was found to be particularly dering sequential prefetch ineffective. For the latter, advantageous [1] but in doing so, contiguous data that ALIS analyzes the reference stream to discover the re- used to be accessed together could be split up. There peated sequences from among the intermingled requests were some early efforts to identify dependent data and and then lays the sequences out physically sequentially to place them together [4, 43], but for the most part, the so that they can be effectively prefetched. By oper- previous work assumed that references are independent, ating at the level of the storage system, rather than which has been shown to be invalid for real workloads in the operating system, ALIS transparently improves (e.g., [20, 21, 45, 47]). Furthermore, the previous work the performance of all I/Os, including system generated did not consider the aggressive sequential prefetch com- I/O (e.g., memory-mapped I/O, paging I/O, file system mon today, and was focused primarily on reducing only metadata I/O) which may constitute well over 60% of the seek time. the I/O activity in a system [43]. Trace-driven simula- The idea of co-locating items that tend to be ac- tions using a large collection of real server and PC work- cessed together has been investigated in several different loads show that ALIS considerably outperforms previ- domains – virtual memory (e.g., [9]), processor cache ous techniques to improve read performance by up to (e.g., [15]), object database (e.g., [50]) etc. The basic ap- 50% and write performance by as much as 22%. proach is to pack items that are likely to be used contem- The rest of this paper is organized as follows. Sec- poraneously into a superunit, i.e., a larger unit of data tion 2 contains an overview of related work. In Sec- that is transferred and cached in its entirety.