Understanding the Linux Virtual Memory Manager

1 Understanding The Linux Virtual Memory Manager Mel Gorman 15th February 2004 Contents List of Figures 5 List of Tables 7 Acknowledgements 9 1 Introduction 12 1.1 General Kernel Literature . 13 1.2 Thesis Overview . 14 1.3 Typographic Conventions . 14 1.4 About this Document . 14 1.5 Companion CD . 15 2 Code Management 17 2.1 Managing the Source . 17 2.2 Getting Started . 23 2.3 Submitting Work . 24 3 Describing Physical Memory 26 3.1 Nodes . 27 3.2 Zones . 29 3.3 Pages . 31 3.4 High Memory . 33 4 Page Table Management 36 4.1 Describing the Page Directory . 37 4.2 Describing a Page Table Entry . 38 4.3 Using Page Table Entries . 39 4.4 Translating and Setting Page Table Entries . 42 4.5 Allocating and Freeing Page Tables . 42 4.6 Kernel Page Tables . 43 4.7 Mapping addresses to struct pages .................. 44 5 Process Address Space 47 5.1 Linear Address Space . 48 5.2 Managing the Address Space . 49 2 CONTENTS 3 5.3 Process Address Space Descriptor . 50 5.4 Memory Regions . 55 5.5 Exception Handling . 70 5.6 Page Faulting . 71 5.7 Copying To/From Userspace . 77 6 Boot Memory Allocator 80 6.1 Representing the Boot Map . 81 6.2 Initialising the Boot Memory Allocator . 81 6.3 Allocating Memory . 83 6.4 Freeing Memory . 85 6.5 Retiring the Boot Memory Allocator . 85 7 Physical Page Allocation 90 7.1 Managing Free Blocks . 90 7.2 Allocating Pages . 91 7.3 Free Pages . 94 7.4 Get Free Page (GFP) Flags . 95 7.5 Avoiding Fragmentation . 97 8 Non-Contiguous Memory Allocation 100 8.1 Describing Virtual Memory Areas . 100 8.2 Allocating A Non-Contiguous Area . 101 8.3 Freeing A Non-Contiguous Area . 102 9 Slab Allocator 104 9.1 Caches . 106 9.2 Slabs . 116 9.3 Objects . 122 9.4 Sizes Cache . 123 9.5 Per-CPU Object Cache . 125 9.6 Slab Allocator Initialisation . 128 9.7 Interfacing with the Buddy Allocator . 128 10 High Memory Management 130 10.1 Managing the PKMap Address Space . 130 10.2 Mapping High Memory Pages . 131 10.3 Mapping High Memory Pages Atomically . 133 10.4 Bounce Buﬀers . 134 10.5 Emergency Pools . 136 11 Page Frame Reclamation 138 11.1 Pageout Daemon (kswapd) . 139 11.2 Page Cache . 140 11.3 Manipulating the Page Cache . 141 11.4 Shrinking all caches . 145 CONTENTS 4 11.5 Swapping Out Process Pages . 145 11.6 Page Replacement Policy . 147 12 Swap Management 150 12.1 Describing the Swap Area . 151 12.2 Mapping Page Table Entries to Swap Entries . 154 12.3 Allocating a swap slot . 155 12.4 Swap Cache . 157 12.5 Activating a Swap Area . 158 12.6 Deactivating a Swap Area . 160 12.7 Swapping In Pages . 161 12.8 Swapping Out Pages . 162 12.9 Reading/Writing the Swap Area . 162 13 Out Of Memory Management 164 13.1 Selecting a Process . 165 13.2 Killing the Selected Process . 165 14 Conclusion 166 List of Figures 2.1 Example Patch . 20 3.1 Relationship Between Nodes, Zones and Pages . 27 4.1 Linear Address Bit Size Macros . 37 4.2 Linear Address Size and Mask Macros . 38 4.3 Page Table Layout . 40 4.4 Call Graph: paging_init() ....................... 44 5.1 Kernel Address Space . 48 5.2 Data Structures related to the Address Space . 57 5.3 Memory Region Flags . 58 5.4 Call Graph: sys_mmap2() ........................ 63 5.5 Call Graph: get_unmapped_area() ................... 64 5.6 Call Graph: insert_vm_struct() .................... 65 5.7 Call Graph: sys_mremap() ........................ 67 5.8 Call Graph: move_vma() ......................... 67 5.9 Call Graph: move_page_tables() .................... 68 5.10 Call Graph: sys_mlock() ........................ 69 5.11 Call Graph: do_munmap() ........................ 70 5.12 Call Graph: do_page_fault() ...................... 73 5.13 Call Graph: handle_mm_fault() .................... 74 5.14 Call Graph: do_no_page() ........................ 75 5.15 Call Graph: do_swap_page() ...................... 76 5.16 Call Graph: do_wp_page() ........................ 77 5.17 do_page_fault Flow Diagram . 79 6.1 Call Graph: setup_memory() ...................... 82 6.2 Call Graph: __alloc_bootmem() .................... 84 6.3 Call Graph: mem_init() ......................... 86 7.1 Free page block management . 91 7.2 Allocating physical pages . 93 7.3 Call Graph: alloc_pages() ....................... 93 7.4 Call Graph: __free_pages() ...................... 94 5 LIST OF FIGURES 6 8.1 vmalloc Address Space . 101 8.2 Call Graph: vmalloc() . 102 8.3 Call Graph: vfree() . 103 9.1 Layout of the Slab Allocator . 105 9.2 Call Graph: kmem_cache_create() . 113 9.3 Call Graph: kmem_cache_reap() . 114 9.4 Call Graph: kmem_cache_shrink() . 115 9.5 Call Graph: __kmem_cache_shrink() . 115 9.6 Call Graph: kmem_cache_destroy() . 116 9.7 Page to Cache and Slab Relationship . 117 9.8 Slab With Descriptor On-Slab . 118 9.9 Slab With Descriptor Oﬀ-Slab . 119 9.10 Call Graph: kmem_cache_grow() . 120 9.11 Initialised kmem_bufctl_t Array . 120 9.12 Call Graph: kmem_slab_destroy() . 122 9.13 Call Graph: kmem_cache_alloc() . 123 9.14 Call Graph: kmalloc() . 125 9.15 Call Graph: kfree() . 125 10.1 Call Graph: kmap() ............................131 10.2 Call Graph: kunmap() . 133 10.3 Call Graph: create_bounce() . 134 10.4 Call Graph: bounce_end_io_read/write() . 135 10.5 Acquiring Pages from Emergency Pools . 137 11.1 Call Graph: kswapd() . 139 11.2 Call Graph: generic_file_read() . 142 11.3 Call Graph: add_to_page_cache() . 143 11.4 Call Graph: shrink_caches() . 146 11.5 Call Graph: swap_out() . 147 11.6 Page Cache LRU List . 148 12.1 Storing Swap Entry Information in swp_entry_t . 155 12.2 Call Graph: get_swap_page() . 156 12.3 Adding a Page to the Swap Cache . 158 12.4 Call Graph: add_to_swap_cache() . 159 12.5 Call Graph: sys_writepage() . 162 13.1 Call Graph: out_of_memory() . 164 List of Tables 1.1 Kernel size as an indicator of complexity . 12 3.1 Flags Describing Page Status . 34 3.2 Macros For Testing, Setting and Clearing Page Status Bits . 35 4.1 Page Table Entry Protection.

Understanding the Linux Virtual Memory Manager

Better Performance Through a Disk/Persistent-RAM Hybrid Design

Virtual Memory

Linux Physical Memory Analysis

Redhawk Linux User's Guide

Chapter 3. Booting Operating Systems

ELF1 7D Virtual Memory

Isolation, Resource Management, and Sharing in Java

FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems

Improving the Performance of Hybrid Main Memory Through System Aware Management of Heterogeneous Resources

A Hybrid Swapping Scheme Based on Per-Process Reclaim for Performance Improvement of Android Smartphones (August 2018)

An Evolutionary Study of Linux Memory Management for Fun and Profit Jian Huang, Moinuddin K

A Comparison of Kernel Memory Protection for Docker Containers Across Host Operating Systems