The Slab Allocator: an Object-Caching Kernel Memory Allocator

The Slab Allocator: An Object-Caching Kernel Memory Allocator Jeff Bonwick Sun Microsystems Abstract generally superior in both space and time. Finally, Section 6 describes the allocator’s debugging This paper presents a comprehensive design over- features, which can detect a wide variety of prob- view of the SunOS 5.4 kernel memory allocator. lems throughout the system. This allocator is based on a set of object-caching primitives that reduce the cost of allocating complex objects by retaining their state between uses. These 2. Object Caching same primitives prove equally effective for manag- Object caching is a technique for dealing with ing stateless memory (e.g. data pages and temporary objects that are frequently allocated and freed. The buffers) because they are space-efficient and fast. idea is to preserve the invariant portion of an The allocator’s object caches respond dynamically object’s initial state — its constructed state — to global memory pressure, and employ an object- between uses, so it does not have to be destroyed coloring scheme that improves the system’s overall and recreated every time the object is used. For cache utilization and bus balance. The allocator example, an object containing a mutex only needs also has several statistical and debugging features to have mutex_init() applied once — the first that can detect a wide range of problems throughout time the object is allocated. The object can then be the system. freed and reallocated many times without incurring the expense of mutex_destroy() and 1. Introduction mutex_init() each time. An object’s embedded locks, condition variables, reference counts, lists of The allocation and freeing of objects are among the other objects, and read-only data all generally qual- most common operations in the kernel. A fast ker- ify as constructed state. nel memory allocator is therefore essential. How- ever, in many cases the cost of initializing and Caching is important because the cost of con- destroying the object exceeds the cost of allocating structing an object can be significantly higher than and freeing memory for it. Thus, while improve- the cost of allocating memory for it. For example, ments in the allocator are beneficial, even greater on a SPARCstation-2 running a SunOS 5.4 develop- gains can be achieved by caching frequently used ment kernel, the allocator presented here reduced objects so that their basic structure is preserved the cost of allocating and freeing a stream head between uses. from 33 microseconds to 5.7 microseconds. As the table below illustrates, most of the savings was due The paper begins with a discussion of object to object caching: caching, since the interface that this requires will ____________________________________________ shape the rest of the allocator. The next section ⎜____________________________________________ Stream Head Allocation + Free Costs (μsec) ⎜ then describes the implementation in detail. Section ⎜ ⎜ ⎜ construction ⎜ memory ⎜other ⎜ ⎜ ⎜ ⎜ ⎜ 4 describes the effect of buffer address distribution ⎜____________________________________________ allocator + destruction allocation init. ⎜ on the system’s overall cache utilization and bus ⎜ ⎜ ⎜ ⎜ ⎜ old ⎜ 23.6 ⎜ 9.4 ⎜ 1.9 balance, and shows how a simple coloring scheme ⎜ ⎜ ⎜____________________________________________ new ⎜ 0.0 ⎜ 3.8 ⎜ 1.9 ⎜ can improve both. Section 5 compares the allocator’s performance to several other well-known Caching is particularly beneficial in a mul- kernel memory allocators and finds that it is tithreaded environment, where many of the most frequently allocated objects contain one or more foo = kmem_alloc(sizeof (struct foo), embedded locks, condition variables, and other con- KM_SLEEP); structible state. mutex_init(&foo->foo_lock, ...); cv_init(&foo->foo_cv, ...); The design of an object cache is foo->foo_refcnt = 0; straightforward: foo->foo_barlist = NULL; To allocate an object: use foo; if (there’s an object in the cache) take it (no construction required); ASSERT(foo->foo_barlist == NULL); else { ASSERT(foo->foo_refcnt == 0); allocate memory; cv_destroy(&foo->foo_cv); construct the object; } mutex_destroy(&foo->foo_lock); kmem_free(foo); To free an object: Notice that between each use of a foo object we return it to the cache (no destruction required); perform a sequence of operations that constitutes nothing more than a very expensive no-op. All of To reclaim memory from the cache: this overhead (i.e., everything other than ‘‘use foo’’ take some objects from the cache; above) can be eliminated by object caching. destroy the objects; free the underlying memory; 2.2. The Case for Object Caching in the An object’s constructed state must be initial- Central Allocator ized only once — when the object is first brought Of course, object caching can be implemented into the cache. Once the cache is populated, allo- without any help from the central allocator — any cating and freeing objects are fast, trivial operations. subsystem can have a private implementation of the algorithm described above. However, there are 2.1. An Example several disadvantages to this approach: (1) There is a natural tension between an object Consider the following data structure: cache, which wants to keep memory, and the struct foo { rest of the system, which wants that memory kmutex_t foo_lock; back. Privately-managed caches cannot handle kcondvar_t foo_cv; this tension sensibly. They have limited struct bar *foo_barlist; insight into the system’s overall memory needs int foo_refcnt; and no insight into each other’s needs. Simi- }; larly, the rest of the system has no knowledge of the existence of these caches and hence has Assume that a foo structure cannot be freed until no way to ‘‘pull’’ memory from them. there are no outstanding references to it (2) Since private caches bypass the central alloca- (foo_refcnt == 0) and all of its pending bar tor, they also bypass any accounting mechan- events (whatever they are) have completed isms and debugging features that allocator may (foo_barlist == NULL). The life cycle of a possess. This makes the operating system dynamically allocated foo would be something like more difficult to monitor and debug. this: (3) Having many instances of the same solution to a common problem increases kernel code size and maintenance costs. Object caching requires a greater degree of coopera- tion between the allocator and its clients than the standard kmem_alloc(9F)/kmem_free(9F) interface allows. The next section develops an interface to support constructed object caching in the central allocator. 2.3. Object Cache Interface whether it’s acceptable to wait for memory if The interface presented here follows from two none is currently available. observations: (3) void kmem_cache_free( (A) Descriptions of objects (name, size, alignment, struct kmem_cache *cp, constructor, and destructor) belong in the void *buf); clients — not in the central allocator. The Returns an object to the cache. The object allocator should not just ‘‘know’’ that must still be in its constructed state. sizeof (struct inode) is a useful pool Finally, if a cache is no longer needed the client can size, for example. Such assumptions are brittle destroy it: [Grunwald93A] and cannot anticipate the needs of third-party device drivers, streams modules (4) void kmem_cache_destroy( and file systems. struct kmem_cache *cp); (B) Memory management policies belong in the Destroys the cache and reclaims all associated central allocator — not in its clients. The resources. All allocated objects must have clients just want to allocate and free objects been returned to the cache. quickly. They shouldn’t have to worry about This interface allows us to build a flexible allocator how to manage the underlying memory that is ideally suited to the needs of its clients. In efficiently. this sense it is a ‘‘custom’’ allocator. However, it It follows from (A) that object cache creation must does not have to be built with compile-time be client-driven and must include a full specification knowledge of its clients as most custom allocators of the objects: do [Bozman84A, Grunwald93A, Margolin71], nor does it have to keep guessing as in the adaptive-fit (1) struct kmem_cache *kmem_cache_create( methods [Bozman84B, Leverett82, Oldehoeft85]. char *name, Rather, the object-cache interface allows clients to size_t size, specify the allocation services they need on the fly. int align, void (*constructor)(void *, size_t), void (*destructor)(void *, size_t)); 2.4. An Example Creates a cache of objects, each of size size, This example demonstrates the use of object cach- aligned on an align boundary. The align- ing for the ‘‘foo’’ objects introduced in Section 2.1. ment will always be rounded up to the The constructor and destructor routines are: minimum allowable value, so align can be void zero whenever no special alignment is required. foo_constructor(void *buf, int size) name identifies the cache for statistics and { debugging. constructor is a function that struct foo *foo = buf; constructs (that is, performs the one-time ini- tialization of) objects in the cache; destruc- mutex_init(&foo->foo_lock, ...); tor undoes this, if applicable. The construc- cv_init(&foo->foo_cv, ...); tor and destructor take a size argument so foo->foo_refcnt = 0; that they can support families of similar foo->foo_barlist = NULL; caches, e.g. streams messages. } kmem_cache_create returns an opaque descriptor for accessing the cache. void Next, it follows from (B) that clients should need foo_destructor(void *buf, int size) just

The Slab Allocator: an Object-Caching Kernel Memory Allocator

CSE 220: Systems Programming

Memory Allocation Pushq %Rbp Java Vs

Unleashing the Power of Allocator-Aware Software

P1083r2 | Move Resource Adaptor from Library TS to the C++ WP

Fast Efficient Fixed-Size Memory Pool: No Loops and No Overhead

Effective STL

Memory Allocation

Composing High-Performance Memory Allocators

A Progressive Register Allocator for Irregular Architectures

Implementing a New Register Allocator for the Server Compiler in the Java Hotspot Virtual Machine

GUARDER: a Tunable Secure Allocator

The Concept of Memory Allocator