Memory Management
Total Page:16
File Type:pdf, Size:1020Kb
Memory Management Advanced Operating System Lecture 5 Colin Perkins | https://csperkins.org/ | Copyright © 2017 | This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. Lecture Outline • Virtual memory • Layout of a processes address space • Stack • Heap • Program text, shared libraries, etc. • Memory management Colin Perkins | https://csperkins.org/ | Copyright © 2017 2 Virtual Memory (1) • Processes see virtual addresses – each process believes it has access to the entire address space • Kernel programs the MMU (“memory management unit”) to translate these into physical addresses, representing real memory – mapping changed each context switch to another process Main memory! CPU chip! 0:! Virtual! Address! Physical! 1:! address! translation! address! 2:! (VA)! (PA)! 3:! CPU! MMU! 4:! 4100 4 5:! 6:! 7:! ... ! Source: Bryant & O’Hallaron, “Computer Systems: A Programmer’s Perspective”, 3rd Edition, Pearson, 2016. Fig. 9.2. http://csapp.cs.cmu.edu/ M-1:! 3e/figures.html (website grants permission for lecture use with attribution) Data word! Colin Perkins | https://csperkins.org/ | Copyright © 2017 3 Virtual Memory (2) Virtual address spaces! Physical memory! • Virtual-to-physical memory mapping 0! 0! Address translation! can be unique or shared VP 1! Process i:! VP 2! • Unique mappings typical – physical memory N-1! owned by a single process Shared page! • Shared mappings allow one read-only copy 0! Process j:! VP 1! of a shared library to be accessed by many VP 2! processes N-1! M-1! Source: Bryant & O’Hallaron, “Computer Systems: A Programmer’s Perspective”, • Memory protection done at page level – each 3rd Edition, Pearson, 2016. Fig. 9.9. http://csapp.cs.cmu.edu/3e/figures.html page can be readable, writable, executable… (website grants permission for lecture use with attribution) • Mapping is on granularity of pages VIRTUAL ADDRESS! n-1! p-1! 0! • Virtual to physical mapping typically a VPN 1! VPN 2! ...! VPN k! VPO! multi-level process Level 1! Level 2! Level k! page table! page table! ...! ...! page table! • Page tables organised as a tree; only entries that are populated, and their ancestors, exist PPN! to save space m-1! p-1! 0! • Translation managed by the kernel – invisible PPN! PPO! to applications PHYSICAL ADDRESS! Source: Bryant & O’Hallaron, “Computer Systems: A Programmer’s Perspective”, 3rd Edition, Pearson, 2016. Fig. 9.18. http://csapp.cs.cmu.edu/3e/figures.html Colin Perkins | https://csperkins.org/ | Copyright © 2017 (website grants permission for lecture use with attribution) 4 Layout of a Processes Address Space • Typical layout of process address space, in 0xFFFFFFFF virtual memory Kernel Space • Program text, data, and global variables at bottom of address space; heap follows, growing upwards 0xC0000000 • Kernel at top of address space; stack grows down, Stack ⬇ below kernel • Memory mapped files and shared libraries Memory Mapped Files and Libraries ⬇ between these Heap ⬆ BSS Segment Data Segment Program Text 0x00000000 Typical addresses on 32 bit machines See also http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory/ Colin Perkins | https://csperkins.org/ | Copyright © 2017 5 Program Text, Data, and BSS • Compiled machine code of the program text 0xFFFFFFFF typically occupies bottom of address space Kernel Space • Lowest few pages above address zero reserved to trap null-pointer dereferences – access will trigger 0xC0000000 kernel trap, killing process with segmentation fault Stack ⬇ • Program text is marked read-only • Data segment contains variables initialised in Memory Mapped Files and Libraries ⬇ the source code • E.g., static char *hello = “Hello, world!”; Heap ⬆ at the top level of a C program would allocate space in the data segment BSS Segment • Data segment is known at compile time, loaded along with program text Data Segment Program Text • BSS segment holds space for uninitialised 0x00000000 global variables • BSS is “block started by symbol”; name is historical relic • Initialised to zero by the runtime when the program loads (C standard requires this for uninitialised static variables) Colin Perkins | https://csperkins.org/ | Copyright © 2017 6 The Stack • Stack holds function parameters, return 0xFFFFFFFF address, and local variables Kernel Space • Starts at high address in memory • Each function called pushes data onto stack, growing 0xC0000000 stack downwards towards lower memory addresses Stack ⬇ • Parameters for the function; return address; pointer to previous stack frame; local variables Memory Mapped Files and Libraries ⬇ • When the function returns, that data is removed from the stack, which shrinks • The stack is managed automatically Heap ⬆ The operating system generates the stack frame for • BSS Segment main() when the program starts Data Segment • The compiler generates the code to manage the stack Program Text when it compiles the program text 0x00000000 • The calling convention for functions – how parameters are pushed onto the stack – is standardised for a given processor and programming language • Ownership tracks function execution Colin Perkins | https://csperkins.org/ | Copyright © 2017 7 Function Calling Conventions • Example: code and contents of stack while calling printf() function #include <stdio.h> Arguments to main(): int argc int char *argv Address to return to after main() main(int argc, char *argv[]) { Local variables for main() char greeting[] = “Hello”; char greeting[] if (argc == 2) { Arguments for printf() printf(“%s, %s\n”, greeting, argv[1]); char *format return 0; char *greeting } else { char *argv[1] Address to return to after printf(“usage: %s <name>\n”, argv[0]); printf() Address to previous stack frame return 1; } Local variables for printf() } … • The address of the previous stack frame is stored for ease of debugging, so stack trace can be printed, and so it can easily be restored when function returns Colin Perkins | https://csperkins.org/ | Copyright © 2017 8 Buffer Overflow Attacks • Classic buffer overflow attack: • Language runtime doesn’t check array bounds • Writing too much data overflows the space Arguments to main(): allocated to local variables, overwrites function int argc char *argv return address, and following data Address to return to after main() • Contents are valid machine code; the overwritten function return address is made to point to that Local variables for main() code char greeting[] • When function returns, code written during the Arguments for printf() overflow is executed char *format char *greeting char *argv[1] Address to return to after printf() • Solve by marking stack as non-executable, Address to previous stack frame or randomising start address of stack each Local variables for printf() time program runs … • Or use a language that enforces array bounds Local variables stored on stack checks immediately preceding function return address – overflowing local variable • Various more complex buffer overflow attacks still space will overwrite return address possible – e.g., see “return-oriented programming” Colin Perkins | https://csperkins.org/ | Copyright © 2017 9 The Heap • Explicitly allocated memory located in the heap 0xFFFFFFFF • In C, memory allocated using malloc()/calloc() Kernel Space • In Java, objects allocated using new 0xC0000000 • … Stack ⬇ • Starts at a low address in memory • Consecutive allocations placed at increasing Memory Mapped Files and Libraries ⬇ addresses in memory Usually consecutive addresses, although some types • Heap ⬆ of processor require allocations to be aligned to a 32 or 64 bit boundary BSS Segment • Modern malloc() implementations typically thread Data Segment aware, and can use different regions of the heap for different threads, to avoid cache sharing Program Text 0x00000000 • NUMA systems complicate heap allocation • Memory management primarily concerned with managing the heap Colin Perkins | https://csperkins.org/ | Copyright © 2017 10 Memory Mapped Files • The mmap() system call manipulates virtual 0xFFFFFFFF memory mappings Kernel Space • Common use: allows files to be mapped into virtual memory 0xC0000000 Stack ⬇ • Returns a pointer to a memory address that acts as a proxy for the start of the file • Reads from/writes to subsequent addresses acts on Memory Mapped Files and Libraries ⬇ the underlying file File is demand paged from/to disk as needed – only • ⬆ the parts of the file that are accessed are read into Heap memory (granularity depends on virtual memory system – often 4k pages) BSS Segment • Useful for random access to parts of files Data Segment Program Text • Can also be used to establish mappings for 0x00000000 new virtual address space – i.e., to allocate memory • In modern Unix-like systems, this is how malloc() gets memory from the kernel Colin Perkins | https://csperkins.org/ | Copyright © 2017 11 Shared Libraries • Code for shared libraries also mapped into 0xFFFFFFFF memory between heap and stack Kernel Space • Virtual memory system enforces these are read-only 0xC0000000 Stack ⬇ • One copy only in physical memory; virtual Memory Mapped Files and Libraries ⬇ address can differ for each process Heap ⬆ BSS Segment Data Segment Program Text 0x00000000 Colin Perkins | https://csperkins.org/ | Copyright © 2017 12 The Kernel • Operating system kernel owns top part of the 0xFFFFFFFF address space Kernel Space • Not accessible to user-space