University of New Mexico

Memory Virtualization: The Memory API

Prof. Patrick G. Bridges

Memory API: malloc()

#include <stdlib.h>

void *malloc(size_t size);

• Main programmer-level API in C.
• Language-level interface, not the actual OS interface.
• Allocates a memory region on the heap.
  ▪ Argument
    ▪ size_t size: size of the memory block (in bytes)
    ▪ size_t is an unsigned integer type.
  ▪ Return
    ▪ Success: a pointer of type void * to the memory block allocated by malloc
    ▪ Fail: a null pointer
• free/calloc/realloc also exist in the same vein.
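As a minimal sketch of the call described above, the following allocates an array on the heap and checks for the null pointer that malloc returns on failure. The helper name make_filled_array is hypothetical, chosen only for illustration; it is not part of the C library.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libc): allocate an array of n ints,
 * set every element to `value`, and return it. Returns NULL on failure. */
int *make_filled_array(size_t n, int value)
{
    /* Guard against n * sizeof(int) overflowing size_t. */
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;

    int *a = malloc(n * sizeof *a);   /* void * converts implicitly in C */
    if (a == NULL)                    /* malloc signals failure with NULL */
        return NULL;

    for (size_t i = 0; i < n; i++)
        a[i] = value;
    return a;                         /* the caller must free() the result */
}
```

A caller would use it as `int *a = make_filled_array(4, 7);`, check the result against NULL, and later release it with `free(a);`.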

Lots of ways to go wrong with memory

• Some sample things we’ve all done
  ▪ Not copying enough data (e.g. terminating nulls)
  ▪ Not allocating space for the data copied
  ▪ Not freeing data (memory leaks)
  ▪ Accessing data after it’s freed
  ▪ Freeing an area multiple times

• What does each of these actually do?

• Answering that requires understanding how the language API is built on top of the OS memory API.
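To make a few of these mistakes concrete, here is a hedged sketch of the correct patterns: allocating room for the terminating null, copying it along with the data, and clearing a pointer after freeing it so a second free is harmless. Both helper names (copy_string, free_and_clear) are hypothetical and exist only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of s, or NULL on failure. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);          /* does NOT count the terminating '\0' */
    char *copy = malloc(len + 1);    /* +1 leaves room for the terminator */
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len + 1);        /* copy the bytes and the '\0' */
    return copy;
}

/* Hypothetical helper: free *p and reset it to NULL, so that a later
 * free_and_clear(&p) is a no-op rather than a double free, and a later
 * dereference fails fast instead of reading freed memory. */
void free_and_clear(char **p)
{
    free(*p);     /* free(NULL) is defined to do nothing */
    *p = NULL;
}
```

Writing `malloc(strlen(s))` instead of `malloc(strlen(s) + 1)` is the classic form of the first two bullets: the copy then writes one byte past the end of the allocation.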

System Calls

#include <unistd.h>

int brk(void *addr);
void *sbrk(intptr_t increment);

• The malloc library call uses brk and/or other system calls.
  ▪ brk is called to expand the program’s break.
  ▪ break: the location of the end of the heap in the address space
  ▪ sbrk is an additional call similar to brk.
  ▪ Programmers should never directly call either brk or sbrk.

• What does this actually do?
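One way to explore that question is to read the program break before and after a malloc call. The sketch below only queries the break with sbrk(0), which does not move it, so it stays within the advice not to adjust the break directly. The helper names are hypothetical, and the result depends on the C library: an allocator that serves the request from memory it already holds, or from mmap, will report no movement at all.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc needs this to declare sbrk() */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: report the current program break without moving it. */
void *current_break(void)
{
    return sbrk(0);   /* an increment of 0 only queries the break */
}

/* Hypothetical helper: how many bytes the break moved while malloc served
 * a request of nbytes. The value may well be 0; see the note above. */
intptr_t break_growth_during_malloc(size_t nbytes)
{
    char *before = current_break();
    void *p = malloc(nbytes);
    char *after = current_break();
    free(p);
    return (intptr_t)(after - before);
}
```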

How do brk and sbrk work?

Address Space  The greyed-out area of an address 0x400000 Code space is not actually allocated. (Text) 0x401000  To use it, the OS has to make it Data 0xcf2000 available. Heap 0xd13000  Brk/sbrk sets the location of the

boundary between allocated heap heap memory and unallocated memory!

(free)

stack

0x7fff9ca28000 Stack 0x7fff9ca49000

System Calls (Cont.)

#include <sys/mman.h>

void *mmap(void *ptr, size_t length, int prot, int flags, int fd, off_t offset);

▪ mmap can create an anonymous memory region.
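A minimal sketch of an anonymous mapping, assuming a POSIX-like system that provides MAP_ANONYMOUS (Linux and the BSDs do). The wrapper names map_anonymous and unmap_region are hypothetical; mmap, munmap, and the flags are the real interface.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc needs this to expose MAP_ANONYMOUS */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `length` bytes of zero-filled, private,
 * read/write memory that is not backed by any file. NULL on failure. */
void *map_anonymous(size_t length)
{
    void *p = mmap(NULL,                        /* let the kernel pick the address */
                   length,
                   PROT_READ | PROT_WRITE,      /* prot: readable and writable */
                   MAP_PRIVATE | MAP_ANONYMOUS, /* flags: no file behind the region */
                   -1, 0);                      /* fd and offset are unused here */
    return p == MAP_FAILED ? NULL : p;          /* mmap reports errors as MAP_FAILED */
}

/* Hypothetical helper: give the region back. 0 on success, -1 on error. */
int unmap_region(void *p, size_t length)
{
    return munmap(p, length);
}
```

Note that mmap signals failure with MAP_FAILED rather than a null pointer, which is why the wrapper translates it.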

What about mmap?

Address Space  mmap lets the program request finer- 0x400000 Code grain allocation of parts of its address (Text) 0x401000 space Data 0xcf2000  More than just moving the program Heap break 0xd13000

 Note that the address space is now heap disjoint! mmap region  mmap can also do implicit file I/O – (free) that’s a later topic… stack

0x7fff9ca28000 Stack 0x7fff9ca49000

How are malloc/new implemented?

• The language runtime (libc) gets memory in chunks from the operating system.
  ▪ Using mmap or sbrk, 4 KB or more at a time
  ▪ (Why not in smaller pieces?)
• The language runtime divides these big blocks up to satisfy malloc/new requests.
  ▪ The basic data structure is a “free list”, a linked list of free chunks of memory.
  ▪ malloc/new searches the list to find a chunk that satisfies an allocation request.
  ▪ free returns chunks to this list.
• Important questions
  ▪ Where are the pointers for the linked lists stored?
  ▪ What block do you use to satisfy an allocation request?
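The free-list idea can be sketched in a few dozen lines. This is a toy under stated assumptions, not how any particular libc works: a fixed static arena stands in for a chunk obtained from mmap or sbrk, the list pointers are stored in a header at the start of each block (one common answer to the first question), the search is first fit (one possible answer to the second), and freed blocks are never coalesced. All names (block_t, toy_malloc, toy_free, ARENA_SIZE) are hypothetical.

```c
#include <stddef.h>

#define ARENA_SIZE 4096   /* stands in for one chunk from mmap or sbrk */

/* Header at the start of every block, free or allocated. The free-list
 * pointer lives inside the block itself, so the list needs no extra memory. */
typedef struct block {
    size_t size;          /* usable bytes that follow this header */
    struct block *next;   /* next free block; meaningful only while free */
} block_t;

static block_t arena[ARENA_SIZE / sizeof(block_t)];
static block_t *free_list = NULL;
static int initialized = 0;

/* Round n up to a multiple of the header size so headers stay aligned. */
static size_t round_up(size_t n)
{
    size_t a = sizeof(block_t);
    return (n + a - 1) / a * a;
}

void *toy_malloc(size_t size)
{
    if (!initialized) {   /* first call: the whole arena is one free block */
        free_list = arena;
        free_list->size = sizeof arena - sizeof(block_t);
        free_list->next = NULL;
        initialized = 1;
    }
    if (size == 0 || size > sizeof arena)
        return NULL;
    size = round_up(size);

    /* First fit: walk the list until a block is large enough. */
    for (block_t **link = &free_list; *link != NULL; link = &(*link)->next) {
        block_t *b = *link;
        if (b->size < size)
            continue;
        if (b->size >= size + 2 * sizeof(block_t)) {
            /* Split: the tail of b becomes a new, smaller free block. */
            block_t *rest = (block_t *)((unsigned char *)(b + 1) + size);
            rest->size = b->size - size - sizeof(block_t);
            rest->next = b->next;
            b->size = size;
            *link = rest;
        } else {
            *link = b->next;   /* too small to split: hand out the whole block */
        }
        return b + 1;          /* the payload starts right after the header */
    }
    return NULL;               /* no free block is large enough */
}

void toy_free(void *p)
{
    if (p == NULL)
        return;
    block_t *b = (block_t *)p - 1;   /* step back from payload to header */
    b->next = free_list;             /* push on the front; no coalescing */
    free_list = b;
}
```

Because toy_free pushes onto the front of the list, a block that was just freed is the first candidate for the next request of equal or smaller size.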

Fragmentation: Storage Virtualization Enemy #1

• We generally have to divide big blocks into smaller blocks, and there’s rarely an exact fit.
  ▪ Variable-size allocations can leave small, hard-to-allocate blocks.
  ▪ Sometimes we have to allocate more space than was requested.
• Wasted space from storage allocation is called fragmentation.
  ▪ Internal fragmentation: wasted space that’s allocated but not used. (You want 5 bytes but we have to allocate 8.)
  ▪ External fragmentation: small free pieces that are too small to satisfy a request.

• How do we allocate storage to handle fragmentation?
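The internal-fragmentation example (want 5 bytes, get 8) is just round-up arithmetic. The sketch below assumes the allocator hands out space in multiples of a fixed granule; the granule of 8 comes from the slide's example, and the function names are hypothetical.

```c
#include <stddef.h>

/* Round `request` up to the next multiple of `granule` (granule > 0). */
size_t rounded_size(size_t request, size_t granule)
{
    return (request + granule - 1) / granule * granule;
}

/* Bytes that are allocated but never used by the caller. */
size_t internal_waste(size_t request, size_t granule)
{
    return rounded_size(request, granule) - request;
}
```

With a granule of 8, a 5-byte request is rounded to 8 bytes and wastes 3, while a 16-byte request wastes nothing.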
