
Memory Management in Linux

By: Rohan Garg (2002134), Gaurav Gupta (2002435)

Architecture Independent Memory Model

• Memory is divided into pages.
• The page size is given by the PAGE_SIZE macro in asm/page.h (4 KB for x86 and 8 KB for Alpha).
• The pages are divided between 4 segments: User Code, User Data, Kernel Code, Kernel Data.
• In User mode, a process can access only User Code and User Data.
• But in Kernel mode, access is also needed to User Data.

Addressing the Memory

• Segment + Offset = linear address; the linear address space is 4 GB (32 bits).
• Of this, user space = 3 GB (defined by the TASK_SIZE macro) and kernel space = 1 GB.
• A linear address is converted to a physical address using 3 levels of page tables.

[Diagram: a linear address is split into | Index into Page Dir. | Index into Page Middle Dir. | Index into Page Table | Offset |]

Requesting and Releasing Page Frames

• alloc_pages(gfp_mask, order) :- used to request 2^order contiguous page frames.
• alloc_page(gfp_mask) :- returns the address of the descriptor of the allocated page frame (for a single page only).
• __get_free_pages(gfp_mask, order) :- returns the linear address of the first allocated page.
• get_zeroed_page(gfp_mask) :- first invokes alloc_pages and then fills the page with zeros.
• __get_dma_pages(gfp_mask, order) :- gets page frames suitable for DMA.

GFP mask

• The flag specifies how to look for free page frames.
• E.g. __GFP_WAIT :- the kernel is allowed to block the current process while waiting for free page frames.

Freeing page frames

• __free_pages(page, order) :- decreases the count field of the descriptor by 1; if it drops to 0, frees the 2^order contiguous page frames.
• free_pages(addr, order) :- like __free_pages(), but takes the linear address addr of the first page frame instead of its descriptor.
• __free_page(page) :- releases the single page frame having page descriptor page.
• free_page(addr) :- releases the single page frame having linear address addr.

Finding a Physical Page

• unsigned long __get_free_pages(int priority, unsigned long order, int dma), in mm/page_alloc.c.

• Priority =
  • GFP_BUFFER (free page returned only if available in physical memory)
  • GFP_ATOMIC (return page if possible; do not block the current process)
  • GFP_USER (current process can be interrupted)
  • GFP_KERNEL (kernel can be interrupted)

  • GFP_NOBUFFER (do not attempt to reduce the buffer cache)
• order says: give me 2^order pages (max is 128 KB)
• dma specifies that the allocation is for DMA purposes

Page descriptor

• Used to keep track of the current status of each page frame.
• Some of the key fields of the structure are described below:
  • list :- contains pointers to the next and previous items in a doubly linked list of page descriptors.
  • count :- usage reference counter for the page; a value greater than 0 implies that the page frame is in use by one or more processes.
  • flags :- describes the status of the page frame.
  • lru :- contains pointers for the least-recently-used doubly linked list of pages.
  • zone :- the zone to which the page frame belongs.

Buddy System Algorithm

• Used for allocating groups of contiguous page frames; helps solve the problem of external fragmentation.
• All free page frames are grouped into lists of blocks containing groups of 1, 2, 4, 8, ..., 512 contiguous page frames.

• If 128 contiguous page frames are required, the 128-frame list is consulted first. If no block is found there, the 256-frame list is consulted; if a 256-frame block is found, it is split: 128 frames are allocated and the remaining 128 frames are added to the 128-frame list. Failing that, the 512-frame list is consulted, and so on.

Slab

• Runs on top of the basic buddy system algorithm.
• It does not discard already-allocated objects but keeps them cached in memory, thus avoiding reinitialization.
• Creates pools of memory areas of the same type, called caches.
• Caches are divided into slabs, each slab consisting of one or more contiguous page frames.
• The slab allocator never releases the page frames of an empty slab unless the kernel is looking for additional free page frames.

Interface between slab allocator and buddy system

void *kmem_getpages(kmem_cache_t *cachep, unsigned long flags)
{
    void *addr;
    flags |= cachep->gfpflags;
    addr = (void *) __get_free_pages(flags, cachep->gfporder);
    return addr;
}

• The slab allocator invokes this function, which calls into the buddy system algorithm to obtain a group of free contiguous page frames.
• Similarly, kmem_freepages() is used by the slab allocator to release a group of page frames.

Process Address Space

[Diagram: process address space layout, top down]
  0xC0000000 : Kernel
               File name, environment
               Arguments
               Stack (grows downward)
               Shared libs
  _end, bss, _bss_start, _edata, Data, _etext, Code, Header (0x84000000)

Address Space Descriptor

• mm_struct is defined in the process descriptor (in linux/sched.h).
• This descriptor is shared if CLONE_VM is specified on forking (otherwise it is duplicated).

struct mm_struct {
    int count;                       // no. of processes sharing this descriptor
    pgd_t *pgd;                      // page directory ptr
    unsigned long start_code, end_code;
    unsigned long start_data, end_data;
    unsigned long start_brk, brk;
    unsigned long start_stack;
    unsigned long arg_start, arg_end, env_start, env_end;
    unsigned long rss;               // no. of pages resident in memory
    unsigned long total_vm;          // total # of bytes in this address space
    unsigned long locked_vm;         // # of bytes locked in memory
    unsigned long def_flags;         // status to use when mem regions are created
    struct vm_area_struct *mmap;     // ptr to first region desc.
    struct vm_area_struct *mmap_avl; // for faster search of region descs.
};

Memory Allocation for Kernel Segment

• Static:
      memory_start = console_init(memory_start, memory_end);
  Typically done for drivers to reserve areas, and for some other kernel components.
• Dynamic:
      void *kmalloc(size, priority);   void kfree(void *);
      void *vmalloc(size);             void vfree(void *);

• kmalloc is used for physically contiguous pages, while vmalloc does not necessarily allocate physically contiguous pages.
• Memory allocated is not initialized (and is not paged out).

kmalloc() data structures

[Diagram: the sizes[] array of size descriptors, one per block size 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536 and 131072 bytes; each size descriptor points to lists of page descriptors holding free blocks (bh), with the largest classes possibly NULL.]

vmalloc()

• Allocates virtually contiguous pages, but they do not need to be physically contiguous.
• Uses __get_free_page() to allocate the physical frames.
• Once all the required physical frames are found, the virtual addresses are created (and mappings set) in an unused part of the kernel's virtual address space.
• The virtual address search (for unused parts) on x86 begins at the next address after physical memory, on an 8 MB boundary.

• One (virtual) page is left free after each allocation, as a cushion against overruns.

vmalloc vs kmalloc

• Contiguous vs. non-contiguous physical memory.
• kmalloc is faster but less flexible.
• vmalloc involves __get_free_page() and may need to block to find a free physical page.
• DMA requires contiguous physical memory.
• All kernel segment pages are locked in memory (no swapping).

• User pages can be paged out to:
  • a complete block device (swap partition), or
  • fixed-length files in a file system.

• The first 4096 bytes are a bitmap indicating, for each bit set, that the space for that page is available for paging.
• At byte 4086, the string "SWAP-SPACE" is stored.
• Hence, max swap = 4086*8 - 1 = 32687 pages = 130748 KB per device or file.
• MAX_SWAPFILES specifies the number of swap files or devices.

• A swap device is more efficient than a swap file.

Page Fault

• The error code is written onto the stack, and the faulting virtual address is stored in register CR2.
• do_page_fault(struct pt_regs *regs, unsigned long error_code) is then called.
• If the faulting address is in the kernel segment, alarm messages are printed out and the process is terminated.
• If the faulting address is not in a virtual memory area, check whether VM_GROWSDOWN is set for the next virtual memory area (i.e. the stack). If so, expand the VM area; if the expansion fails, send SIGSEGV.

• If the faulting address is in a virtual memory area, check whether the protection bits are OK. If the access is not legal, send SIGSEGV; else call do_no_page() or do_wp_page().

Page Replacement Algorithm

• LRU :- Least Recently Used replacement
• NFU :- Not Frequently Used replacement
• Page-ageing-based replacement
• Working Set algorithm, based on locality of reference per process
• Working-Set-based clock algorithms
• LRU with ageing and the Working Set algorithms are efficient and commonly used

Page Replacement Handling in Linux

• Page Cache
  • Pages are added to the page cache for fast lookup.
  • Page cache pages are hashed based on their address space and page index.
  • Inode or disk-block pages, shared pages and anonymous pages form the page cache.
  • Swap-cached pages, also part of the page cache, represent the swapped pages.
  • Anonymous pages enter the swap cache at swap-out time; shared pages enter when they become dirty.

LRU Cache

• The LRU cache is made up of active lists and inactive lists.
• These lists are populated during page faults and when page-cached pages are accessed or referenced.
• kswapd is the page-out kernel thread that balances the LRU cache and trickles out pages based on an approximation of the LRU algorithm.
• The active lists contain referenced pages; they are monitored for page references through refill_inactive.
• Referenced pages are given a chance to age through move-to-front; unreferenced pages are moved to the inactive list.
• The inactive lists contain the sets of inactive-clean and inactive-dirty pages.
• These sets are monitored periodically, whenever the pages_high threshold for free pages is crossed on a per-zone basis.

Thank you