Virtual Memory Algorithm Improvement
VIRTUAL MEMORY ALGORITHM IMPROVEMENT

Kamleshkumar Patel
B.E., Sankalchand Patel College of Engineering, 2005

PROJECT

Submitted in partial satisfaction of the requirements for the degree of MASTER OF SCIENCE in COMPUTER SCIENCE at CALIFORNIA STATE UNIVERSITY SACRAMENTO

FALL 2009

Approved by:
Dr. Chung-E Wang, Committee Chair
Dick Smith, Emeritus Faculty, CSUS, Second Reader

Student: Kamleshkumar Patel

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the thesis.

Dr. Cui Zhang, Ph.D., Graduate Coordinator
Department of Computer Science

Abstract of VIRTUAL MEMORY ALGORITHM IMPROVEMENT by Kamleshkumar Patel

The central component of any operating system is the Memory Management Unit (MMU). As the name implies, memory-management facilities are responsible for managing the memory resources available on a machine. Virtual memory (VM) in the MMU allows a program to execute as if primary memory were larger than its actual size. The whole purpose of virtual memory is to enlarge the address space, that is, the set of addresses a program can use. A program that uses all of its virtual memory cannot fit in main memory all at once. Nevertheless, the operating system can execute such a program by copying the required portions into main memory at any given point during execution. To facilitate copying virtual memory into real memory, the operating system divides virtual memory into pages. Each page contains a fixed number of addresses. Each page is stored on disk; when a page is needed, it is copied into main memory.
To achieve better performance, the system needs to provide requested pages quickly from memory. A tree search algorithm finds requested pages in the virtual page data structure. Its efficiency depends heavily on the structure and number of nodes, which is why the data structure is an important feature of virtual memory. The splay tree and the radix tree are the most popular data structures for the current Linux operating system. The FreeBSD OS uses the splay tree data structure. In some situations, such as prefix searching, the splay tree is not the most effective data structure. As a result, the OS needs a better data structure than the splay tree to access VM pages quickly in those situations. In this project, the radix tree is implemented and used in place of the splay tree. The objective is efficient use of memory and faster performance. Both data structures are run in parallel to check the correctness of the newly implemented radix tree. Once the results were satisfactory, I benchmarked the data structures and found that the radix tree gave much better performance than the splay tree when a process holds more pages.

TABLE OF CONTENTS

List of Tables
List of Figures

Chapter
1. INTRODUCTION
   1.1 Project objective
   1.2 Project plan
   1.3 To do list
2. OVERVIEW OF DATA STRUCTURE
   2.1 The splay tree data structure
   2.2 The radix tree data structure
3. OVERVIEW OF VIRTUAL MEMORY
   3.1 Virtual memory and related terms
   3.2 Overview of FreeBSD virtual memory
4. PROJECT IMPLEMENTATION
   4.1 FreeBSD installation guide
   4.2 Useful tips
   4.3 Steps of project implementation
       4.3.1 Kernel debugging
       4.3.2 Splay tree implementation
       4.3.3 The radix tree implementation
5. PERFORMANCE MEASUREMENT
   5.1 Performance measurement algorithm
   5.2 Performance difference: splay tree vs. radix tree
6. CONCLUSION
Appendix
References

LIST OF TABLES
Table 5.1.1: Performance difference, splay vs. radix

LIST OF FIGURES

1. Figure 2.1.1: Splay tree implementation 1
2. Figure 2.1.2: Splay tree implementation 2
3. Figure 2.1.3: Splay tree rotation implementation
4. Figure 2.2.1: Radix tree implementation
5. Figure 3.2.1: Layout of virtual address space
6. Figure 3.2.2: Data structure that describes a process address space
7. Figure 3.2.3: Layout of an address space
8. Figure 4.1.1: Installation screenshot 1
9. Figure 4.1.2: Installation screenshot 2
10. Figure 5.1.1: Layout of pages

Chapter 1
INTRODUCTION

The FreeBSD (Berkeley Software Distribution) operating system uses the splay tree data structure to organize the virtual memory pages of a process. A splay tree is a self-adjusting binary search tree. If we search for an element that we looked up recently, we find it very quickly because it is near the top. This is a good feature because many real-world programs access some values much more frequently than others. The cost of a splay tree lookup can be divided into two stages. The first is the search for the required key in the tree. On average (in the amortized sense), this takes O(log N) time, where N is the number of keys in the splay tree. In the worst case, however, a single lookup can take O(N) time. The second cost arises during tree modification, in which the algorithm performs rotations to move the recently accessed element to the root. The fact that a lookup performs O(log N) writes into the tree adds significantly to its cost on modern hardware.
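The two lookup costs described above can be illustrated with a small user-space sketch. This is not the FreeBSD kernel code: the names `SplayTree`, `lookup`, `insert`, and the `rotations` counter are hypothetical, chosen only for illustration, and counting rotations is used here as a rough stand-in for the pointer writes a lookup performs. The sketch uses a simple recursive splay, so it is only suitable for small trees.

```python
class Node:
    """One tree node; in a VM setting the key would be a page index."""
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class SplayTree:
    """Illustrative splay tree that counts rotations (i.e. tree writes)."""

    def __init__(self):
        self.root = None
        self.rotations = 0  # hypothetical counter, for illustration only

    def _rotate_right(self, x):
        y = x.left
        x.left = y.right
        y.right = x
        self.rotations += 1
        return y

    def _rotate_left(self, x):
        y = x.right
        x.right = y.left
        y.left = x
        self.rotations += 1
        return y

    def _splay(self, root, key):
        # Bring the node with `key` (or the last node on its search path)
        # to the root of this subtree.
        if root is None or root.key == key:
            return root
        if key < root.key:
            if root.left is None:
                return root
            if key < root.left.key:        # zig-zig
                root.left.left = self._splay(root.left.left, key)
                root = self._rotate_right(root)
            elif key > root.left.key:      # zig-zag
                root.left.right = self._splay(root.left.right, key)
                if root.left.right is not None:
                    root.left = self._rotate_left(root.left)
            return root if root.left is None else self._rotate_right(root)
        if root.right is None:
            return root
        if key > root.right.key:           # zag-zag
            root.right.right = self._splay(root.right.right, key)
            root = self._rotate_left(root)
        elif key < root.right.key:         # zag-zig
            root.right.left = self._splay(root.right.left, key)
            if root.right.left is not None:
                root.right = self._rotate_right(root.right)
        return root if root.right is None else self._rotate_left(root)

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
            return
        self.root = self._splay(self.root, key)
        if self.root.key == key:
            return  # key already present
        node = Node(key)
        if key < self.root.key:
            node.right = self.root
            node.left = self.root.left
            self.root.left = None
        else:
            node.left = self.root
            node.right = self.root.right
            self.root.right = None
        self.root = node

    def lookup(self, key):
        # Even a read-only lookup restructures (writes to) the tree.
        self.root = self._splay(self.root, key)
        return self.root is not None and self.root.key == key

    def inorder(self):
        # Iterative in-order traversal; returns the keys in sorted order.
        out, stack, cur = [], [], self.root
        while stack or cur is not None:
            while cur is not None:
                stack.append(cur)
                cur = cur.left
            cur = stack.pop()
            out.append(cur.key)
            cur = cur.right
        return out


if __name__ == "__main__":
    tree = SplayTree()
    for page in range(1, 9):
        tree.insert(page)
    tree.lookup(1)                       # deep key: many rotations
    print("rotations after first lookup:", tree.rotations)
    tree.lookup(1)                       # same key again: already at the root
    print("rotations after repeat lookup:", tree.rotations)
```

Inserting keys in ascending order leaves the smallest key at the bottom of a chain, so the first lookup of that key has to rotate it all the way up, while an immediate repeat lookup finds it at the root and performs no rotations. This mirrors both the benefit (fast repeated access) and the cost (writes on every lookup) discussed above.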
Because every lookup also writes to the tree, some developers do not consider the splay tree an ideal data structure and have been trying to modify the VM page data structure.

1.1 Project objective

The Virtual Memory Algorithm Improvement project uses the radix tree data structure in place of the splay tree data structure. The objective is efficient use of memory and faster performance. Achieving this objective requires implementing a better data structure than the splay tree to support large VM objects in the FreeBSD kernel.

1.2 Project plan

1. Implement a new data structure in user space first. The new data structure is a generalized radix tree. Test for memory leaks and correctness. Evaluate space and time efficiency.
2. Modify the kernel code, run the new data structure in parallel with the old data structure on two separate machines, and test for identical functionality.
3. Perform performance evaluation.

1.3 To do list

2. Understand the splay tree data structure and its operation, the radix tree structure, and VM space management.
3. Test for memory leaks and functional correctness.
4. Integrate the new data structure into the kernel code and run it in parallel with the existing splay tree; check whether the values returned are identical.
5. Remove the splay tree and measure performance for the new data structure.

Chapter 2
OVERVIEW OF DATA STRUCTURE

Tree search algorithms are mainly used for structured data. The efficiency of a tree search depends heavily on the number and structure of nodes in relation to the number of items on those nodes. Virtual memory needs a sophisticated search algorithm to access VM pages quickly and also needs an effective data structure to manage virtual pages. Currently, virtual memory uses the splay tree data structure. But in some situations, the radix tree data structure implementation has low