A STUDY OF SWAP CACHE BASED PREFETCHING TO IMPROVE VIRTUAL MEMORY PERFORMANCE

A thesis submitted to the Division of Research and Advanced Studies of the University of Cincinnati in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in the Department of Electrical and Computer Engineering and Computer Science of the College of Engineering

2002

by

Udaykumar Kunapuli
Bachelor of Engineering, Osmania University, Hyderabad, India, 1998

Committee Chair: Dr. Yiming Hu

Abstract

With the dramatic increase in processor speeds over the last decade, disk latency has become a critical issue in computer system performance. Disks, being mechanical devices, are orders of magnitude slower than the processor or physical memory. Most Virtual Memory (VM) systems use the disk as secondary storage for an application's idle data pages, while the working set of pages is kept in memory. When a page requested by the processor is not present in memory, a page fault occurs, and the Operating System brings the requested page from the disk into memory. The performance of Virtual Memory systems thus depends on disk performance.

In this project, we aim to reduce the effect of disks on Virtual Memory performance relative to a traditional demand paging system. We study novel techniques of page grouping and prefetching to improve Virtual Memory system performance. In our system, we group pages according to their access times. Most demand paging systems use a least recently used (LRU) page replacement policy, so pages evicted from memory at about the same time were also last accessed at about the same time; the order in which pages are evicted from memory is therefore similar to the order in which they were last accessed. We assume that processes access pages on disk in a sequence similar to the sequence in which the pages were swapped to disk. The goal of our research is to study how far this assumption is valid and whether it can be used to improve the Virtual Memory performance of demand paging systems.

We group pages evicted from memory at about the same time into a single large block, and on a page fault we prefetch the entire block along with the faulting page. We implement this grouping and prefetching scheme with a swap cache. The swap cache combines a group of pages evicted from memory into a superblock, which is the basic unit of I/O during paging and swapping. Pages are evicted from memory into the swap cache; when the swap cache is full, the superblock is written to disk in one large write. When a page needs to be read, the entire superblock containing that page is read from the disk directly into memory, so all pages with memory eviction locality are prefetched in a single disk read. Our aim is to investigate whether Virtual Memory I/O performance can be improved by predicting the future access sequence from the recorded memory eviction sequence.
From this study, we find that swap cache based prefetching significantly reduces the number of read accesses to the disk. Our simulations show that disk read accesses fell by at least 12% for all six SPEC 2000 benchmark applications used in this study, and for some applications by as much as 90%. We also find improvements in the Virtual Memory I/O performance of many SPEC 2000 benchmark applications: with the swap cache, the Virtual Memory performance of five of the six applications improved by at least 25%, with some improving by up to 88%.

Acknowledgments

Firstly, I would like to thank Dr. Yiming Hu for providing me the opportunity to work under his guidance, and for his support and suggestions that helped me conduct this research. I would like to thank Dr. Karen Tomko and Dr. Wen-Ben Jone for serving on my thesis committee.

I would like to thank my parents, Mr. and Mrs. Prasad Rao, for their love and encouragement all these years. I would like to thank my uncle Dr. Narayana for helping me in every way possible during my graduate study, my brother Sekhar for his support, and my cousin Bharggavudoo for his help. I would especially like to thank my cousins Sree Giridhar and Kala for all their great help and support during my stay in Cincinnati.

I would like to thank my friends at UC, especially Sudhir and Venu, for all their help. I would like to thank all my lab-mates from the OSCAR Lab, particularly Rui, Sohum, Swaroop and Venkat. I am honored to have been in the same lab as Rui Min; I have high regard for his phenomenal intellectual abilities, and I thank him for patiently answering all my questions. I would like to thank Sohum for giving me access to the lab, and Swaroop and Venkat for their guidance and help. I had some great discussions with them, ranging from UNIX text editors to comparisons of Linux and Microsoft Windows. I would like to thank my friends from India, Sai and Ramesh, for their support.

Finally, I would like to thank Bram Moolenaar for designing the VIM (Vi IMproved) text editor, which made this documentation and implementation a thoroughly enjoyable task.

Contents

1 Introduction
  1.1 Background - Virtual Memory and Demand Paging
  1.2 Motivation
  1.3 Swap Cache and Page Grouping
    1.3.1 Large Disk Operations
  1.4 Organization of Thesis
2 Related Work
3 Swap Cache Architecture
  3.1 Page Faults
  3.2 Architecture and Methodology
    3.2.1 Page Read Operation
    3.2.2 Page Write Operation
    3.2.3 Dirty Pages
  3.3 Comparison with Previous Research
4 Simulation
  4.1 VM Trace Generation
  4.2 Input Benchmarks
    4.2.1 SPEC 2000
    4.2.2 NASA Parallel (NP) Benchmarks
  4.3 Virtual Memory Simulator
    4.3.1 Input Parameters
    4.3.2 Output Parameters
  4.4 Assumptions
5 Results
  5.1 Trace Analysis
    5.1.1 Working Set History
  5.2 Virtual Memory Performance
    5.2.1 Disk Read Accesses
    5.2.2 Total I/O Time
    5.2.3 Data Transfer Size
  5.3 Conclusions
6 Future Work
  6.1 Virtual Memory Traces
  6.2 Kernel Implementation
  6.3 Advantage of Prefetching
  6.4 Swap Cache With Compressed Caching
  6.5 Swap Cache With TLB And Hardware Support
  6.6 Execution Driven Simulation
A Additional Results

List of Figures

1.1 Average Disk Read Latency
3.1 Average Disk Read Latency
4.1 Simulation Procedure
5.1 Working set history of bzip2 and wupwise
5.2 Disk read accesses of bzip2 and wupwise
5.3 Total I/O time of bzip2 and wupwise
5.4 Total data transfer size of bzip2 and wupwise
A.1 Working Set History of Benchmark Applications
A.2 Working Set History of Benchmark Applications
A.3 Effect of Swap Cache on VM Performance of gcc
A.4 Effect of Swap Cache on VM Performance of gap
A.5 Effect of Swap Cache on VM Performance of mcf
A.6 Effect of Swap Cache on VM Performance of mesa
A.7 Effect of Swap Cache on VM Performance of cgb
A.8 Data Transfer of gcc and gap
A.9 Data Transfer of cgb and mcf

List of Tables

5.1 Analysis of Virtual Memory Traces
5.2 Optimal Swap Cache Size for SPEC and NP Benchmarks

Chapter 1

Introduction

This chapter presents the basic concepts of virtual memory and demand paging. We also present the motivations for undertaking this study, and we introduce the basics of a swap cache based virtual memory system.

1.1 Background - Virtual Memory and Demand Paging

New computer applications demand a high level of performance from the entire computer system. As the size and complexity of software applications grow, the demand for better-performing hardware grows with them. Most computer systems built today use Virtual Memory (VM). Among its many advantages are better support for multiprogramming and freedom from the limit that physical memory size places on program size. In VM systems, memory is divided into fixed-size blocks called pages, and an area of the hard disk called the swap area is used as an extension of physical memory. Most VM systems keep only the working set of an application's pages in physical memory, while the rest are kept on the swap disk. The Operating System manages the transfer of pages between the swap disk and physical memory. The process of bringing a page into memory only when it is needed is called demand paging [17, 4]. Demand paging works well because most programs exhibit locality of reference: at any given time, a program refers only to a subset of its pages, the working set. The Operating System uses efficient LRU algorithms to move idle pages from memory to the disk, and a page is fetched from the disk again when it is needed.
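To make the interplay between demand paging and swap cache based prefetching concrete, the following is a minimal trace-driven sketch in Python. It is an illustration only, not the simulator used in this thesis: the function name simulate, the toy trace, and the parameter values are hypothetical, and the sketch assumes an exact LRU replacement policy, a superblock smaller than physical memory, and no modeling of I/O times. It counts disk read operations for plain demand paging and for demand paging that groups evicted pages into superblocks and prefetches a whole superblock on a fault.

    from collections import OrderedDict

    def simulate(trace, mem_pages, superblock_pages=None):
        """Count disk reads over a page-reference trace under LRU.

        With superblock_pages set, evicted pages are staged in a swap
        cache and written to disk as one superblock; a fault on any page
        of a superblock reads the whole superblock in one disk access,
        prefetching the pages that were evicted alongside it.
        """
        memory = OrderedDict()   # resident pages in LRU order, oldest first
        swap_cache = []          # evicted pages awaiting one large write
        blocks = {}              # superblock id -> pages written together
        block_of = {}            # page -> id of its latest superblock
        disk_reads = 0

        for page in trace:
            if page in memory:
                memory.move_to_end(page)        # refresh LRU position
                continue
            if page in swap_cache:              # still buffered in memory:
                swap_cache.remove(page)         # reclaim without disk I/O
                to_load = [page]
            else:
                disk_reads += 1                 # one read per fault, page or block
                to_load = [page]
                if superblock_pages and page in block_of:
                    group = blocks[block_of[page]]   # eviction-time neighbors
                    to_load = [p for p in group
                               if p != page and p not in swap_cache] + [page]
            for p in to_load:
                if p in memory:
                    continue
                memory[p] = None
                if len(memory) > mem_pages:     # evict the least recently used page
                    victim, _ = memory.popitem(last=False)
                    if superblock_pages is None:
                        continue                # plain paging: page goes to swap
                    swap_cache.append(victim)
                    if len(swap_cache) == superblock_pages:
                        bid = len(blocks)       # swap cache full: one large write
                        blocks[bid] = list(swap_cache)
                        for q in swap_cache:
                            block_of[q] = bid
                        swap_cache.clear()
        return disk_reads

    # Hypothetical toy workload: three sequential sweeps over 12 pages
    # with room for only 8 page frames, a worst case for plain LRU.
    trace = [p for _ in range(3) for p in range(12)]
    print(simulate(trace, mem_pages=8))                      # 36 reads
    print(simulate(trace, mem_pages=8, superblock_pages=4))  # 18 reads

On this hypothetical trace the superblock reads halve the number of disk accesses, because pages that were evicted together are later reused together, which is exactly the eviction locality assumption this thesis sets out to test.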