A Sense of Time for Javascript and Node.Js: First-Class Timeouts As a Cure for Event Handler Poisoning

A Sense of Time for JavaScript and Node.js: First-Class Timeouts as a Cure for Event Handler Poisoning #24, Anonymous Abstract tecture — from the One Thread Per Client Architecture The software development community has begun to (OTPCA) used in Apache to the Event-Driven Architec- adopt the Event-Driven Architecture (EDA) to pro- ture (EDA) championed by Node.js. vide scalable web services, most prominently through The EDA is not a passing fad. Perhaps inspired Node.js. Though the EDA offers excellent scalability, by Welsh et al.’s SEDA concept [99], server-side EDA it comes with an inherent risk: the Event Handler Poi- frameworks like Twisted [27] have been in use since at soning (EHP) Denial of Service attack. In essence, EHP least the early 2000s. But the boom in the EDA has come attacks say that the scalability of the EDA is a double- with Node.js. Node.js (“server-side JavaScript”) was in- edged sword: the EDA is scalable because clients share troduced in 2009 and is now used by many organizations a small set of threads, but if an attacker causes these in many industries, including IBM, Microsoft, Apple, threads to block then the server becomes unusable. EHP Cisco, Intel, RedHat, GE, Siemens, NASA, Wikipedia, attacks are a significant threat, as hundreds of popular and the NFL ([34, 67, 81, 77, 38, 18]). The associated Node.js-based websites were recently reported to be vul- package ecosystem, npm, boasts 575,000 modules [12]. nerable to forms of EHP attacks. It is not an exaggeration to state that Node.js is becoming In this work we make three contributions. First, we a critical component of the modern web [20, 37]. formally define EHP attacks, and we show that EHP at- Given the importance of the EDA to the modern web, tacks are a common form of vulnerability in the largest it has received surprisingly little study from a security EDA community, the Node.js ecosystem. Second, we perspective [53, 78]. We present the first formal treat- design and evaluate an EHP-safe defense: first-class ment of the weaknesses in the EDA, and describe a De- timeouts. Although realizing first-class timeouts is dif- nial of Service attack, Event Handler Poisoning (EHP), ficult in a complex real-world framework like Node.js, that can be used against EDA-based services like Node.js our Node.cure prototype defends Node.js applications applications (x3). against all known EHP attacks. Third, we have identified EHP attacks observe that the source of the EDA’s scal- and documented or corrected vulnerable APIs in Node.js, ability is also its Achilles’ heel. Where the OTPCA gives and our guide on avoiding EHP attacks is now available every client its own thread, the EDA multiplexes many on nodejs.org. clients onto a small number of Event Handlers (threads) Node.cure is effective, defeating all known EHP at- to reduce per-client overheads. Because many clients tacks with application overheads as low as 0%. More share the same Event Handlers, an EDA-based server generally, we show that Node.cure offers strong security must correctly implement cooperative multitasking [90]. guarantees against EHP attacks for the majority of the An incorrect implementation of cooperative multitasking Node.js ecosystem. enables an EHP attack: an attacker dominates the time Millions of developers have embraced the EDA, but spent by an Event Handler, preventing the server from without strict adherence to the EDA paradigm their code handling all clients fairly. We found that although the is vulnerable to EHP attacks. Node.cure offers the cure. EHP problem is little discussed in the Node.js documen- tation, EHP vulnerabilities are common in npm (x4). 1 Introduction In x5 we define criteria for EHP-safety, and use these Web services are the lifeblood of the modern Internet. To criteria to evaluate four EHP defenses (x6). Though vari- minimize costs, service providers want to maximize the ations of three of these defenses are used today, they are number of clients each server can handle. Over the past ad hoc and thus impractical in an ecosystem dominated decade, this goal has led the software community to se- by third-party modules. The fourth defense, the Timeout riously consider a paradigm shift in their software archi- Approach, requires extending existing EDA languages 1 and frameworks, but it provides a universal solution for Loop. The Event Loop may offload expensive Tasks EHP-safety with strong security guarantees. The Time- like file I/O to the queue of a small Worker Pool, whose out Approach makes timeouts a first-class member of an Workers execute Tasks and generate “Task Done” events EDA framework, securing both types of Event Handlers for the Event Loop when they finish [61]. We refer to the with a universal timeout mechanism. Event Loop and the Workers as Event Handlers. Our Node.cure prototype (x7) demonstrates the Time- out Approach in the complex Node.js framework, com- pletely defending real applications against EHP attacks. Node.cure spans the entire Node.js stack, modifying the Node.js JavaScript engine (V8, C++), core modules (JavaScript and C++), and the EDA mechanism (libuv, C), and can secure real applications with low overhead (x8). We feel our work is timely, as Staicu and Pradel recently reported that hundreds of popular websites are vulnerable to one type of EHP attack with minimal attacker effort [93]; Node.cure defeats this and all other EHP attacks. Figure 1: This is the AMPED Event-Driven Architecture. Incoming In summary: events from clients A and B are stored in the event queue, and the as- 1. We formally define Event Handler Poisoning (EHP) sociated Callbacks (CBs) will be executed sequentially by the Event (x3), a DoS attack against the EDA. We systemati- Loop. We will discuss B’s EHP attack (CBB1), which has poisoned the cally demonstrate that EHP attacks are common in the Event Loop, in x3.3. largest EDA community, the Node.js ecosystem (x4). 2. We describe a general antidote for Event Handler Because the Event Handlers are shared by all clients, Poisoning attacks: first-class timeouts. We demon- the EDA has a particular development paradigm. Each strate effectiveness with our Node.cure prototype for Callback and Task is guaranteed atomicity: once sched- Node.js in x7. We evaluate the security guarantees of uled, it runs to completion. Atomicity calls for coopera- timeouts (strong, x6) and their costs (small, x8). tive multitasking [90], and developers partition the gen- 3. Our findings have been corroborated by the Node.js eration of responses into multiple stages. The effect of community. Our guide on EHP-safe techniques is on this partitioning is to regularly yield to other requests by nodejs.org, and our PRs documenting and improving deferring work until the next stage [54]. This partition- unsafe Node.js APIs have been merged (x9). ing results in a Lifeline [41], a DAG describing the parti- tioned steps needed to complete an operation. A Lifeline 2 Background can be seen by following the arrows in Figure 1. In this section we review the EDA (x2.1), explain our 2.2 Node.js among other EDA frameworks choice of EDA framework for study (x2.2), and introduce Examples of EDA frameworks include Node.js directly related work (sections 2.3 and 2.4). (JavaScript) [15], libuv (C/C++) [10], Vert.x (Java) [28], 1 2.1 Overview of the EDA Twisted (Python ) [27], and Microsoft’s P# [57]. These frameworks have been used to build a wide There are two paradigms for web servers, distinguished variety of industry and open-source services (e.g. by the ratio of clients to resources, which corresponds [7, 81, 67, 77, 32, 31, 8, 4]). to an isolation-performance tradeoff. The One Thread Most prominent among these frameworks is Node.js, a Per Client Architecture (OTPCA) dedicates resources to server-side event-driven framework for JavaScript intro- each client, for strong isolation but higher memory and duced in 2009. The popularity of Node.js comes from its context-switching overheads [83]. The Event-Driven Ar- promise of “full stack JavaScript” — client- and server- chitecture (EDA) tries the opposite approach, with many side developers can speak the same language and share clients sharing execution resources: client connections the same libraries. This vision has driven the rise of the Event Loop are multiplexed onto a single-threaded , with JavaScript package ecosystem, npm, which with 575,000 Worker Pool a small for expensive operations. modules is the largest of any language [12]. Node.js is All mainstream server-side EDA frameworks use the still accelerating: use doubled between 2016 and 2017, Asymmetric Multi-Process Event-Driven (AMPED) ar- from 3.5 million developers [33] to 7 million [35]. chitecture [82]. This architecture (hereafter “the EDA”) The Node.js codebase has three major parts [63], is illustrated in Figure 1. In the EDA, the OS or a frame- whose interactions complicate top-to-bottom extensions work places events in a queue, and the Callbacks of pending events are executed sequentially by the Event 1In addition, Python 3.4 introduced native EDA support. 2 like Node.cure. An application’s JavaScript code is ex- ity. The attacker knows how to exploit this vulnerability: ecuted using Google’s V8 JavaScript engine [3], its pro- they know the victim feeds user input to a Vulnerable grammability is supported by Node.js core JavaScript API, and they know evil input that will cause the Vulner- modules with C++ bindings for system calls, and the able API to block the Event Handler executing it.

A Sense of Time for Javascript and Node.Js: First-Class Timeouts As a Cure for Event Handler Poisoning

Elastic Storage for Linux on System Z IBM Research & Development Germany

IBM Spectrum Scale CSI Driver for Container Persistent Storage

File Systems and Storage

Filesystems HOWTO Filesystems HOWTO Table of Contents Filesystems HOWTO

A Sense of Time for Node.Js: Timeouts As a Cure for Event Handler Poisoning

Shared File Systems: Determining the Best Choice for Your Distributed SAS® Foundation Applications Margaret Crevar, SAS Institute Inc., Cary, NC

Intel® Solutions for Lustre* Software SC16 Technical Training

Comparative Analysis of Distributed and Parallel File Systems' Internal Techniques

Zfs-Ascalabledistributedfilesystemusingobjectdisks

ZFS Basics Various ZFS RAID Lebels = & . = a Little About Zetabyte File System ( ZFS / Openzfs)

Lessons Learned from Using Samba in IBM Spectrum Scale

Implementing High Availability and Disaster Recovery Solutions with SAP HANA on IBM Power Systems