Engineering Heap Overflow Exploits with JavaScript

Mark Daniel Jake Honoroff Independent Security Evaluators Independent Security Evaluators Charlie Miller Independent Security Evaluators

Abstract low for attacker control over the target heap. In this pa- per we describe a new technique, inspired by his Heap This paper presents a new technique for exploiting heap Feng Shui, that can be used to reliably position function overflows in JavaScript interpreters. Briefly, given a heap pointers for later smashing with a heap overflow. overflow, JavaScript commands can be used to insure that a function pointer is reliably present for smashing, just This paper contains a description of the technique fol- after the overflown buffer. A case study serves to high- lowed by an account of its application to a WebKit vul- light the technique: the exploit that the authors nerability discovered by the authors and used to win the used to win the 2008 CanSecWest Pwn2Own contest. 2008 CanSecWest Pwn2Own contest.

1 Introduction 2 Technique Many buffer and integer overflow vulnerabilities allow for a somewhat arbitrary set of values to be written at a relative offset to a pointer on the heap. Unfortunately for 2.1 Context the attacker, often the data following the pointer is un- predictable, making exploitation difficult and unreliable. Broadly, this technique can be used to develop client side The most ideal heap overflow, in terms of full attacker exploits against web browsers that use JavaScript. In this control over the quantity and values of overflow bytes, context, the attacker crafts a web page containing(among can be virtually unexploitable if nothing interesting and other things) JavaScript commands, and induces the vic- predictable is waiting to be overwritten. tim to browse the page. Using particular JavaScript com- Thanks to safe unlinking protections, the heap meta- mands, the attacker influences the state of the heap in data structures are often no longer a viable target for the victim’s browser process to arrange for a successful overflows. Currently, application specific data is usually attack. needed as an overflowtarget, where normal programflow results in the calling of a function pointer that has been More specifically, the technique is applicable when overwritten with an attacker supplied shellcode address. one has a heap overflow in hand. In what follows, we However, such exploits are in no way guaranteed to be refer to the buffer that is overflown as the vulnerable reliable. It must be the case that pointers yet to be ac- buffer. This heap overflow must have the property that cessed are sitting on the heap after the overflown buffer, both allocation of the vulnerable buffer and the over- and no other critical data or unmapped memory lies in flow itself must be triggerable within the JavaScript in- between, the smashing of which would result in a pre- terpreter. In particular, this technique will not apply in mature crash. Such ideal circumstances can certainly be situations where the vulnerable buffer has already been rare for an arbitrary application vulnerability. allocated before the JavaScript interpreter has been in- However, given access to a client-side scripting lan- stantiated. guage such as JavaScript, an attacker may be able to cre- We also assume that shellcode is available and a mech- ate these ideal circumstances for vulnerabilities in appli- anism for loading it into memory has already been found. cations like web browsers. In [2], Sotirov describes how This is trivial with JavaScript – just load it into a big to use JavaScript allocations in to al- string. Figure 1: Fragmented heap.

Figure 2: Defragmented heap. Future allocations end up adjacent.

2.2 Overview browser) is unpredictable. Generally, such a heap is frag- mented in that it has many holes of free memory. The Keep in mind that the goal is to control a buffer in the presence of varying sized holes of free memory means heap immediately following the vulnerable buffer. We that the addresses of successive allocations of similarly accomplish this by arranging the heap so that all holes in sized buffers are unlikely to have any reliable relation- it that are big enough to hold the vulnerable buffer are ship. Figure 1 shows how an allocation in a fragmented surrounded by buffers that we control. heap might look. Since it is impossible to predict where The technique consists of five steps. these holes might occur in a heap that has been in use 1. Defragment the heap. for some time, it is impossible to predict where the next allocation will occur. 2. Make holes in the heap. However, if we have some control over the target ap- plication, we may be able to force the application to 3. Prepare the blocks around the holes. make many allocations of any given size. In particular, we can make many allocations that will essentially fill in 4. Trigger allocation and overflow. all the holes. Once the holes are filled up, i.e. the heap is defragmented, allocations of similar size will, in general, 5. Trigger the jump to shellcode. be predictably close to each other, as in Figure 2. These steps are described in more detail in the rest of this We emphasize that defragmentation is always with re- section. spect to a particular buffer size. In preparation for the exploit, we must defragment the heap with respect to the size of the vulnerable buffer. The size of this buffer will 2.3 Defragment vary, but should be known. As shown in [2], arranging The state of a process’s heap dependson the history of al- for defragmentation is simple in JavaScript. For exam- locations and deallocations that have occurred during the ple, process’s lifetime. Consequently, the state of the heap var bigdummy = new Array(1000); for a long running multithreaded process (e.g. a web for(i=0; i<1000; i++){

2 Figure 3: Defragmented heap with many allocations. We see a long line of same-sized buffers that we control.

Figure 4: Controlled heap with every other buffer freed. The allocation of the vulnerable buffer ends up in one of the holes.

bigdummy[i] = new Array(size); we have reached the point where all subsequent alloca- } tions are occurring contiguously at the end of the heap. Unfortunately, at this point, we have a problem. Sim- In the code snippet above, each call to ply deleting an object in JavaScript does not imme- new Array(size) results in an allocation of diately result in the object’s space on the heap be- 4*size+8 bytes on the heap.1 This allocation corre- ing freed. Space on the heap is not relinquished until sponds to an ArrayStorage object consisting of an garbage collection occurs. Internet Explorer provides eight byte header followed by an array of size pointers. a CollectGarbage() method that immediately trig- Initially, the entire allocation is zeroed out. A value of gers garbage collection [2], but other implementations do size should be chosen so that the resulting allocations not. In particular, WebKit does not. Accordingly, we di- are as close as in size as possible, and no smaller, than gress with a discussion of garbage collection in WebKit. the size of the vulnerable buffer. The value of 1000 used above was determined empirically. Our inspection of the WebKit source code turned up three main events that can trigger garbage collection. To understand these events, one must be familiar with the 2.4 Make Holes way WebKit’s JavaScript manages objects. Remember, our goal is to arrange for control of the buffer The implementation maintains two structures, that follows the vulnerable buffer. Assuming defragmen- primaryHeap and numberHeap, each of which is tation worked, we should have a number of contiguous essentially an array of pointers to CollectorBlock buffers at the end of the heap, all roughly the same size objects. A CollectorBlock is a fixed–sized array of as the (yet to be allocated) vulnerable buffer (Figure 3). Cells and each Cell can hold a JSCell object (or The next step is to free every other of these contiguous derivative). Every JavaScript object occupies a Cell buffers, leaving alternating buffers and holes, all match- in one of these heaps. Large objects (such as arrays ing the size of the vulnerable buffer. The first step in and strings) occupy additional memory on the system achieving this is heap. We refer to this additional system heap memory as for(i=900; i<1000; i+=2){ associated storage. delete(bigdummy[i]); Each CollectorBlock maintains a linked list of } those Cells that are free. When an allocation is re- The lower limit in the for loop is based on the assump- quested, and no free Cells are available in any exist- tion that after 900 allocations in the defragment stage, ing CollectorBlocks, a new CollectorBlock is allocated. 1Clearly, details such as this depend on the particular implementa- tion of JavaScript. We used WebKit release 31114 for this investiga- All JavaScript objects deriving from JSObject are tion. allocated on the primaryHeap. The numberHeap

3 before overflow

free block

ArrayStorage 1 0 NumberInstance VFT pNI pVFT pF[0] _ZN3KJS14NumberInstanceD1EV pF[1] _ZN3KJS14NumberInstanceD0EV

. . pF[2] _ZNK3KJS8JSObject4typeEv . free block

Figure 5: Details of an attacker controlled block just before the overflow is triggered. is reserved for NumberImp objects. Note that the lat- web browsing of pages with JavaScript. Since the ter do not correspond to JavaScript Number objects, but numberHeap CollectorBlock has 4062 Cells, instead are typically short lived number objects corre- JavaScript code that allocates at least this many sponding to intermediate arithmetic computations. NumberImps should trigger garbage collection. As When garbage collection is triggered, both heaps are an example, manipulation of doubles is one - tion that results in a NumberImp allocation from the examined for objects with zero reference counts, and numberHeap, so the following JavaScript code can be such objects (and their associated storage if any) are used for garbage collection triggering: freed. We now return to the three events we foundthat trigger for(i=0; i<4100; i++){ garbage collection. They are a = .5; } 1. Expiration of a dedicated garbage collection timer. With the completion of this code, the heap should appear 2. An allocation request that occurs when all of the as in Figure 4, and we are ready for the next step. particular heap’s CollectorBlocks are full. 3. Allocation of an object with sufficiently large asso- 2.5 Prepare the Blocks ciated storage (or a number of such objects). This step is straightforward. We use the following The first of these events is not very useful, because JavaScript. JavaScript has no sleep routine, and any kind of size- for(i=901; i<1000; i+=2){ able delay in the script runs the risk of causing the “Slow bigdummy[i][0] = new Number(i); Script” dialog to appear. } The last of these events depends on the number of ob- jects in the primaryHeap and the sizes of their asso- The code bigdummy[i][0] = new Number(i), ciated storage. Experiments suggest that the state of the creates a new NumberInstance object, and stores a primaryHeap varies greatly with the number of web pointer to this object in the ArrayStorage object cor- pages open in the browser and the degree to which these responding to bigdummy[i]. Figure 5 depicts a por- pages use JavaScript. Consequently, triggering reliable tion of the heap after this JavaScript runs. garbage collection with this event is difficult. On the other hand, the numberHeap appears to 2.6 Trigger Allocation and Overflow be relatively insensitive to these variations. Heuris- tically, the numberHeap maintains only one al- Now it is time to allocate the vulnerable buffer. If the located CollectorBlock, even during significant previous steps have gone as expected, the allocation for

4 eto h vro st vrrt the overwrite ob- to The is overflow. overflow the the for that of ready holes are ject the we of and one in created, up we end will buffer vulnerable the o h lc htimdaeyflostevulnerable the to follows transferred immediately is execution that buffer, invoked. block is that the table function For virtual the in method the of oc alt ita ehdo h underlying execu- the overwritten, not to of were transferred is that method tion blocks the virtual For need a tation. we to specifically, More call NumberInstance a above. blocks force the interact- of simply tion by the executed with is ing shellcode to jump Shellcode The to Jump Trigger 2.7 6. Figure not in is depicted as sled heap look the NOP overflow, should and typical allocation discussed a After be that here. appropriate will note now, sled for the but about below, for Details sled the in shellcode. address the an be should value new The buffer. the o each for dealing of way simple a it. shows with follows that study case of dereference double } i+=2){ i<1000; for(i=901; shellcode. the of execution gers ArrayStorage h olwn aacitfre ita ucincall function virtual a forces JavaScript following The ArrayStorage ouetwiebgum[]0 "); "

Breakpoint 3, 0x95850389 in 4 Conclusions KJS::ArrayInstance::ArrayInstance () array buffer at$1001 = 0x17168000 The technique described in this paper allowed for reliable exploitation of a buffer overflow that initially had no pre- Breakpoint 3, 0x95850389 in dictable and interesting data to overwrite. While some at- KJS::ArrayInstance::ArrayInstance () tacker control is necessary, such as allocation size, over- array buffer at$1002 = 0x17169000 flow size, and overflow data, this technique should be applicable to other browser vulnerabilities when the at- After freeing up a bunch of holes at the end us- tacker has access to JavaScript. We suspect that simi- ing the garbage collection technique described in Sec- lar techniques may be applicable given access to other tion 2.4, we see the vulnerable buffer landing in the last hole at 0x17168000, right before data we control at client-side scripting languages. 0x17169000.

Breakpoint 2, 0x95846748 in References jsRegExpCompile () regex buffer at$1004 = 0x17168000 [1] Flake, H. Third Generation Exploitation. Blackhat Briefings Windows 2002. So we will now overflow our regex buffer into the ArrayStorage data. The bytes of the overflow have [2] Sotirov, A. Heap Feng Shui in JavaScript. Blackhat to be compiled regular expression bytes. Luckily, the Europe 2007. regular expression character class construct allows for

6