poisoning the event-driven architecture: a full-stack look at...

14
Poisoning the Event-Driven Architecture: A Full-Stack Look at Node.js and Friends Anonymous Author(s) Abstract In recent years, both industry and the open-source community have turned to the Event-Driven Architecture (EDA) to provide more scalable web services. Though the Event-Driven Architecture can offer better scalability than the One Thread Per Client Architecture, its use can leave service providers vulnerable to a Denial of Service attack that we call Event Handler Poisoning (EHP). In this work we define EHP attacks and examine the ways in which EDA frameworks expose applications to this risk. We also touch on application-level EHP vulnerabilities. We focus on three popular EDA frameworks: Node.js (JavaScript), Twisted (Python), and libuv (C/C++). Our analysis studies both CPU-bound and I/O-bound approaches to Event Handler Poisoning. For CPU-bound attacks we focus on Regular Expression Denial of Service (ReDoS). For I/O-bound ap- proaches we examine the weaknesses of asynchronous I/O imple- mentations in these frameworks. Using our Constant Worst-Case Execution Time pattern, EDA- based services can become invulnerable to EHP attacks. Guided by this pattern, in our system node.cure we implement two framework- level defenses for Node.js that serve as effective antidotes to the concerns we raise. CCS Concepts Security and privacy Denial-of-service attacks; Software security engineering; Web application security; Keywords Event-Driven Architecture; Denial of Service; Node.js; Algorithmic Complexity; ReDoS 1 Introduction Web servers are the lifeblood of the modern Internet. To handle the demands of millions of clients, service providers replicate their services across many servers. Since each additional server costs time and money, service providers want to maximize the number of clients each server can handle. Over the past decade, this goal has led the software community to seriously consider a paradigm shift in their software architecture — from the One Thread Per Client Architecture (OTPCA) to the Event-Driven Architecture (EDA) (also known as the reactor pattern). Though using the EDA may increase the number of clients each server can handle, it also exposes the service to a class of Denial of Service (DoS) attack unique to the EDA: Event Handler Poisoning (EHP). Traditionally, web services have been implemented using the One Thread Per Client Architecture (OTPCA), and support more clients by increasing the number of threads. In practice the overhead Anonymous Submission to ACM CCS 2017, Due 19 May 2017, Dallas, Texas 2017. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00 https://doi.org/10.1145/nnnnnnn.nnnnnnn induced by each additional thread artificially limits the number of clients each server can handle, requiring the service provider to op- erate more servers than is strictly necessary. The EDA significantly reduces the overhead paid for each additional client, multiplying the number of clients each server can handle. Figure 1 gives some intuition about how the EDA achieves its low per-client overhead. Rather than creating a new thread for each request, which introduces overheads in time (context-switching) and space (per-thread stack), the EDA simply stores the request as an Event in a queue to be handled by a fixed set of Event Handlers. In other words, the EDA achieves scalability using thread multiplexing. By multiplexing these unrelated computations onto the same Event Handlers, the EDA amortizes the space and time overhead of each additional client. Since each client costs less in the EDA, EDA-based services can handle a larger number of clients concurrently. Figure 1: This is the Event-Driven Architecture. Incoming events are handled sequentially by the event loop. The Event Loop may generate new events in the form of tasks it of- floads to the Worker Pool. We refer to the event loop and the workers in the worker pool as Event Handlers. Alas, the scalability promised by the EDA comes at a price. Mul- tiplexing unrelated work onto the same thread(s) may reduce over- head, but it also moves the burden of time sharing out of the thread library or operating system and into the application itself. Where OTPCA-based services can rely on the preemptive multitasking promised by the underlying system to ensure that resources are shared fairly, using the EDA requires the service to enforce its own cooperative multitasking [71]. In other words, thread multiplex- ing destroys isolation. For example, if a client’s request takes a long time to handle in an OTPCA-based service, preemptive mul- titasking will ensure that other requests are handled fairly. But if this same lengthy request is submitted to an EDA-based service, and if the service incorrectly implements cooperative multitasking, then this request will deny service to all other clients because they cannot use the Event Handler that is processing the request. We call this state of affairs Event Handler Poisoning. We present a general principle, borrowed from the real-time systems community, that EDA-based services can use to defend 1

Upload: others

Post on 02-Jan-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Poisoning the Event-Driven Architecture:A Full-Stack Look at Node.js and Friends

Anonymous Author(s)

AbstractIn recent years, both industry and the open-source community haveturned to the Event-Driven Architecture (EDA) to provide morescalable web services. Though the Event-Driven Architecture canoffer better scalability than the One Thread Per Client Architecture,its use can leave service providers vulnerable to a Denial of Serviceattack that we call Event Handler Poisoning (EHP).

In this work we define EHP attacks and examine the ways inwhich EDA frameworks expose applications to this risk. We alsotouch on application-level EHP vulnerabilities. We focus on threepopular EDA frameworks: Node.js (JavaScript), Twisted (Python),and libuv (C/C++).

Our analysis studies both CPU-bound and I/O-bound approachesto Event Handler Poisoning. For CPU-bound attacks we focus onRegular Expression Denial of Service (ReDoS). For I/O-bound ap-proaches we examine the weaknesses of asynchronous I/O imple-mentations in these frameworks.

Using our Constant Worst-Case Execution Time pattern, EDA-based services can become invulnerable to EHP attacks. Guided bythis pattern, in our system node.curewe implement two framework-level defenses for Node.js that serve as effective antidotes to theconcerns we raise.

CCS Concepts• Security and privacy→Denial-of-service attacks; Softwaresecurity engineering; Web application security;

KeywordsEvent-Driven Architecture; Denial of Service; Node.js; AlgorithmicComplexity; ReDoS

1 IntroductionWeb servers are the lifeblood of the modern Internet. To handlethe demands of millions of clients, service providers replicate theirservices across many servers. Since each additional server coststime and money, service providers want to maximize the number ofclients each server can handle. Over the past decade, this goal hasled the software community to seriously consider a paradigm shiftin their software architecture — from the One Thread Per ClientArchitecture (OTPCA) to the Event-Driven Architecture (EDA) (alsoknown as the reactor pattern). Though using the EDA may increasethe number of clients each server can handle, it also exposes theservice to a class of Denial of Service (DoS) attack unique to theEDA: Event Handler Poisoning (EHP).

Traditionally, web services have been implemented using theOne Thread Per Client Architecture (OTPCA), and support moreclients by increasing the number of threads. In practice the overhead

Anonymous Submission to ACM CCS 2017, Due 19 May 2017, Dallas, Texas2017. ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00https://doi.org/10.1145/nnnnnnn.nnnnnnn

induced by each additional thread artificially limits the number ofclients each server can handle, requiring the service provider to op-erate more servers than is strictly necessary. The EDA significantlyreduces the overhead paid for each additional client, multiplyingthe number of clients each server can handle.

Figure 1 gives some intuition about how the EDA achieves itslow per-client overhead. Rather than creating a new thread for eachrequest, which introduces overheads in time (context-switching)and space (per-thread stack), the EDA simply stores the request asan Event in a queue to be handled by a fixed set of Event Handlers. Inother words, the EDA achieves scalability using thread multiplexing.By multiplexing these unrelated computations onto the same EventHandlers, the EDA amortizes the space and time overhead of eachadditional client. Since each client costs less in the EDA, EDA-basedservices can handle a larger number of clients concurrently.

Figure 1: This is the Event-Driven Architecture. Incomingevents are handled sequentially by the event loop. The EventLoop may generate new events in the form of tasks it of-floads to the Worker Pool. We refer to the event loop andthe workers in the worker pool as Event Handlers.

Alas, the scalability promised by the EDA comes at a price. Mul-tiplexing unrelated work onto the same thread(s) may reduce over-head, but it also moves the burden of time sharing out of the threadlibrary or operating system and into the application itself. WhereOTPCA-based services can rely on the preemptive multitaskingpromised by the underlying system to ensure that resources areshared fairly, using the EDA requires the service to enforce its owncooperative multitasking [71]. In other words, thread multiplex-ing destroys isolation. For example, if a client’s request takes along time to handle in an OTPCA-based service, preemptive mul-titasking will ensure that other requests are handled fairly. But ifthis same lengthy request is submitted to an EDA-based service,and if the service incorrectly implements cooperative multitasking,then this request will deny service to all other clients because theycannot use the Event Handler that is processing the request. Wecall this state of affairs Event Handler Poisoning.

We present a general principle, borrowed from the real-timesystems community, that EDA-based services can use to defend

1

Page 2: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

against EHP. As EHP attacks are possible when one request to anEDA-based server doesn’t cooperatively yield to other requests,our antidote is simple: the server must ensure that every stageof the request handling takes no more than constant time beforeit yields to other requests. In applying this principle, EDA-basedservers extend input sanitization from the realm of data into therealm of the algorithms used to process it. We evaluate severalpopular EDA frameworks in light of this principle, identifying EHPvulnerabilities in each. Our node.cure prototype extends Node.js todefend against the vulnerabilities we identified there.

This paper is an extension of a workshop paper [28]. There weintroduced the idea of EHP attacks, argued that some previously-reported security vulnerabilities were actually EHP vulnerabilities,and gave a sketch of our antidote. Here we offer the completepackage: a refined definition of EHP, a general principle for defense,examples of vulnerabilities in the wild, and effective framework-level defenses.

In summary, this paper makes the following contributions:

(1) We are the first to define Event Handler Poisoning, an attack thatexploits a general vulnerability in the EDA (§3). Our definitionof EHP is general, and captures attacks targeting either theevent loop or the worker pool.

(2) Using concepts borrowed from the real-time systems commu-nity, we describe a general antidote for Event Handler Poisoning(§4). Our antidote gives developers precise guidance, replacingthe dangerous heuristics currently offered in the EDA commu-nity.

(3) We identify potential EHP vulnerabilities in software built usingthe popular Node.js EDA framework (§5). In doing so we extendthe state of the art in ReDoS detection.

(4) Informed by our EHP antidote, we identify weaknesses inNode.js itself that expose its community to EHP. In node.cure

we implement effective solutions to these problems, defendingNode.js applications against two forms of EHP: ReDoS (§5.2)and Large/Slow I/O (§6.3).

2 BackgroundIn this section we contrast the EDA with the OTPCA (§2.1). Wethen explain our choice of EDA frameworks for study (§2.2), andfinish by explaining formative work on Algorithmic Complexity(AC) attacks (§2.3).

2.1 The OTPCA and the EDAThere are two main architectures for scalable web servers, the OneThread Per Client Architecture and the Event-Driven Architecture.The traditional approach is the OTPCA, of which Apache [2] isthe most prominent. In the OTPCA style, a thread or a process isassigned to each client, with a large maximum number of concur-rent clients. The OTPCA model isolates clients from one another,reducing the risk of interference. However, each additional clientincurs memory and context switching overhead. In contrast, bymultiplexing multiple client connections into a single thread andoffering a small worker pool for asynchronous tasks, an EDA ap-proach significantly reduces a server’s per-client resource use. Butthis design also reduces isolation between clients, exposing theserver to vulnerabilities like EHP.

Although EDA approaches have been discussed and implementedin the academic and professional communities for decades (e.g. [30,42, 48, 69, 74, 77]), historically EDA has only seen wide adoption inuser interface settings like web browsers. How things have changed.The EDA is increasingly relevant in systems programming todaythanks to the explosion of interest in Node.js [12], an event-drivenserver-side JavaScript framework. Although the OTPCA and EDAarchitectures have the same expressive power [55], in practice EDAimplementations have been found to offer greater scaling [67].

Several variations on the server-side EDA have been proposed:single-process event-driven (SPED), asymmetricmulti-process event-driven (AMPED), and scalable event-driven (SEDA). The AMPEDarchitecture is depicted in Figure 1, while in SPED there is noworkerpool [66]. In the SEDA architecture, applications are designed as apipeline composed of stages, each of which resembles Figure 1 [77].

Figure 1 illustrates the widely used AMPED EDA formulation,which is common to all mainstream general-purpose EDA frame-works today. In the AMPED EDA (hereafter simply “the EDA”)the operating system or a framework places events in a queue (e.g.through Linux’s poll system call1), and pending events are handledby an event loop. The event loop may offload expensive activity toa small, fixed-size worker pool, which in turn generates an eventto inform the event loop when an offloaded task completes. Theworker pool is implemented using threads or separate processes,and offers services like “asynchronous” (blocking but concurrent)access to the file system or DNS.

In applications built using EDA, there is a set of possible Events(e.g. “new network traffic”) for which the developer supplies a set ofCallbacks [46]. During the execution of an EDA-based application,events are generated by external activity or internal processing androuted to the appropriate Callbacks. Each Callback is executed onone of the Event Handlers; synchronous Callbacks are executed onthe event loop, asynchronous Callbacks on a worker.

2.2 Popular EDA FrameworksAn EDA approach can be (and has been) implemented in most pro-gramming languages. Table 1 lists the dominant2 general-purposeEDA framework for a range of languages: Node.js (JavaScript) [12],Twisted (Python) [19], libuv (C/C++) [9], Reactor (Java) [16], Event-Machine (Ruby) [5], Zotonic (Erlang) [21], and Microsoft’s P# [41].To capture the ongoing investment in each framework, we surveyedits GitHub [6] repository for the number of commits, contributors,and releases, with results shown in Table 1.

Guided by Table 1, we selected Node.js (JavaScript), Twisted(Python), and libuv (C/C++) for closer investigation, weighting ourselection by the number of contributors (as a proxy for communityinterest) and the number of commits (as a proxy for maturity).

Node.js is a server-side event-driven framework for JavaScriptthat uses the architecture shown in Figure 1. Theworker pool is usedinternally to perform asynchronous DNS queries and file system I/O.In addition, user-defined code can be offloaded to the worker pool.Born in 2009, Node.js now claims 3.5 million users [27], and its open-source package ecosystem, npm, is the largest of any programming1Per the Linux man page, “poll...waits for one of a set of file descriptors to becomeready to perform I/O.”2We identified the dominant framework for each language through a tournamentapproach, using web searches to identify the frameworks for each language, and thesame metrics as in Table 1 to choose the most dominant among them.

2

Page 3: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

Framework Contributors Commits ReleasesNode.js (JavaScript) 1344 17355 426Twisted (Python) 52 21897 39libuv (C/C++) 255 3725 185Reactor (Java) 56 4249 42EventMachine (Ruby) 118 1153 29Zotonic (Erlang) 60 6632 69P# 10 1542 2

Table 1: Maturity of event-driven programming environ-ments measured by development activity. Data were ob-tained from GitHub on May 6, 2016.

language or framework [10]. Node.js has seen adoption in industry,at companies including LinkedIn [63], PayPal [50], eBay [65], andIBM [11].

Twisted is an event-driven networking engine written in Pythonthat uses the architecture shown in Figure 1. During its 14 years,Twisted has seen use in significant projects like Ubuntu One [26],Apple’s Calendar and Contacts Server [22], and the SageMath Note-book [24].

C/C++ developers can use the libuv framework to write event-driven programs. Node.js’s event-driven runtime relies on libuv, soboth frameworks use the same architecture. Though libuv was firstdeveloped in 2011 to support Node.js, it has since been adopted byother frameworks and languages. Examples include the numericalcomputing language Julia [23], early versions of Mozilla’s systemsprogramming language Rust [25], and others [14].

2.3 Algorithmic Complexity AttacksOur work was inspired by and generalizes the idea of AlgorithmicComplexity (AC) attacks [37, 60]. In an AC attack, attackers forcevictims to take longer than expected to perform an operation. Ratherthan overwhelming the victim with volume, the attacker turns thevictim’s vulnerable algorithms against him. AC vulnerabilities cantherefore be exploited to launch DoS attacks, when the attackerforces the victim to spend a long time processing his request at theexpense of legitimate clients, or when the attacker increases thecost even of legitimate requests.

Regular expression Denial of Service (ReDoS) attacks are aclass of AC attack that has gained attention in recent years [54, 68,70, 73]. Developers often use regular expressions for input sani-tization, and must produce regular expressions that correctly dif-ferentiate between valid and invalid input. When the set of validinputs is varied, the resulting regular expressions can become com-plex. Unfortunately, a poor choice of regular expression exposes anapplication to ReDoS.

Though regular expression engines perform well in general,many popular engines (Perl, Python, V8 JavaScript, etc.) have worst-case exponential-time performance. An ReDoS attack convinces avulnerable regular expression to evaluate evil input that trig-gers this worst-case exponential running time. Such an evaluationwill run effectively forever, at least until it is canceled or timedout. This worst-case exponential-time performance occurs becauseproduction-grade “regular expression” engines are misnamed —they support Context-Free Language extensions to traditional regu-lar expressions, and these extensions may in the worst case requireexponential time.

3 Event Handler Poisoning AttacksServer-side applications implemented using the EDA are vulnerableto a unique type of denial of service attack: Event Handler Poisoning.Key to the EDA is the multiplexing of many client requests ontoa small set of Event Handler threads. An Event Handler Poisoningattack exploits this design by poisoning one or more Event Handlers,causing them to block, perhaps indefinitely. This is done throughthe submission of evil input, an expensive task that is either CPU-bound or I/O-bound.

AC-style vulnerabilities are thus a form of EHP: if an attackeridentifies an AC-style vulnerability in an EDA-based server, hecan submit “evil input” to trigger the WCET of the vulnerablealgorithm. When an Event Handler picks up the request, it will takean inordinately long time to handle it, starving subsequent requests— an EHP attack. EHP attacks redirect the energies of the victimrather than trying to cause it to crash.

We take issue with the prevailing wisdom in the EDA community,which is that offloading tasks to the worker pool is a panacea forexpensive work. Suppose an attacker identifies an EHP vulnerabilityin code executed by the worker pool. A pool of size k will jam afteronly k poisonous requests; the EHP vulnerability is essentially thesamewhether the attack is against the event loop or theworker pool.To induce complete DoS, an attacker’s aim is to submit requeststhat poison either the event loop or each worker in the worker pool,thereby delaying the handling of subsequent requests.

We note that an attack targeting a worker pool can in principlebe carried out against any architecture or system with a cap on theamount of resources available for clients, including both EDA andpool-based OTPCA designs. However, the pool in a OTPCA systemis orders of magnitude larger than the worker pool in typical EDAimplementations. While Node.js has a default worker pool of size 4with a maximum of 128 workers [13] , Apache’s httpd server has adefault connection pool of size 256-400, with a maximum size ofaround 20000 [1] . Due to the size of an OTPCA server’s connectionpool, attempts to jam it would presumably be detected by an IDS oraddressed through rate limiting. By comparison, the small numberof requests needed to jam a vulnerable EDA server’s worker poolis too few to draw attention from network-level defenses, placingthe burden of defense on the application developer or the runtimesystem to address it.

An important facet of an EHP attack is its asymmetry. As in aDistributed DoS (DDoS) attack, a carefully crafted “evil request”will be roughly the same size and shape as the requests submittedby legitimate clients. However, unlike a DDoS attack, the attackerneed only submit one (or k) rather than a flood. This significantlyreduces the resources required by the attacker, and makes it difficultfor a network-level IDS system to detect an attack.

Figure 2 illustrates the work patterns available in the EDA. (a-c)are vulnerable to EHP, while (d) and (e) are safe (§4).

3.1 EHP Threat ModelOur threat model is straightforward: the attacker crafts “evil input”,and convinces the EDA-based victim to feed it to a vulnerableAPI. This model is reasonable because our victim is an EDA-basedserver ; servers exist solely to process client input using APIs. If theserver uses a vulnerable API, and if it invokes it with client input,

3

Page 4: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

Figure 2: This figure illustrates various EDA work patterns,drawing on material from §3 and §4. Figures (a-c) are vul-nerable to EHP, while (d) and (e) practice our antidote (§4).(a) shows the event loop performing an expensive opera-tion synchronously. (b) shows the event loop offloading thatoperation to the worker pool, where it is performed syn-chronously by a worker. (c) shows the event loop offloadingthe partitioned operation to the worker pool, but note thatone of the partitions is not C-WCET. (d) shows a C-WCETpartitioning of the operation, with each partition offloadedto the worker pool. In (e) the server offloads a request to anexternal service.

the server can be poisoned. We identify vulnerable APIs in EDAframeworks in sections 5 and 6, and elaborate on this model there.

3.2 Defining CPU-bound VulnerabilitiesThe CPU-bound EHP vulnerabilities in an EDA system are a form ofAlgorithmic Complexity (AC) vulnerability (§2.3). In an AC attack,attackers force victims to take longer than expected to perform anoperation, typically by exploiting the Worst-Case Execution Time(WCET) of an algorithm used to process requests. If an attackeridentifies an AC-style vulnerability in an EDA-based server, he cansubmit evil input to trigger the WCET of the vulnerable algorithm.When an Event Handler picks up the request, it will take an inor-dinately long time to handle it, starving subsequent requests – anEHP attack.

CPU-bound EHP vulnerabilities in the EDA are more generalthan Crosby and Wallach’s formulation of AC attacks. According toCrosby andWallach [37], the victim of an AC attack is an algorithmor data structure that performs well in the average case but poorlyon carefully chosen “evil input”. In the EDA setting, however, anyalgorithmwith non-constantWCET (non-C-WCET) and uncontrolledinput constitutes an EHP vulnerability (§3).

While the EDA was originally intended for I/O-bound applica-tions, we argue that the need to perform computation in an EDAsystem is growing. Full-featured servers that rely on the EDA mustinevitably deal with the computations needed to implement busi-ness logic, and with the advent of fog computing [32], embedded IoTdevices will increasingly be called on to perform data processing.As these input-driven devices are most readily programmed using

the EDA model, as shown in Cylon.js [4], processing the incomingdata will require computation within an EDA setting.

3.3 Defining I/O-bound VulnerabilitiesWeb servers are commonly called on to interact with the file sys-tem. An I/O-bound EHP vulnerability exists when the attacker canrequest the server make a particularly slow I/O. For example, ina file server, a slow I/O might be a read of an unusual file, whichcan take on average 8x longer to serve [80], or perhaps data storedon slow media, e.g. a network drive. We suggest several ways inwhich an attacker might exploit an I/O-bound EHP in Section 6.

Gaps between theory and practice aside, we have argued thatin the EDA the risk of DoS is more general than just the eventloop: EHP attacks can be launched against either the event threador the worker pool. When practitioners incautiously offload tasksto the worker pool, they are guilty of conceptual abuse. In manyEDA frameworks the worker pool is small and fixed-size – thereare far fewer worker threads than clients per server. This meansthat cavalierly offloading potentially long-running tasks to theworker pool is a reversion from the EDA to the OTPCA. In the EDA,developers must take care even in the tasks they offload to the workerpool.

4 AnAntidote for Event Handler Poisoning

When an algorithm’s overall complexity is notO(1), being C-WCET-partitioned gives it two properties that make it ideal for the EDAsetting. First, it will be cooperative; it will not be executed atomically,but will instead allow other events to have a turn on the EventHandler. Second, it will be predictable: other events will have a turnon the Event Handler within a fixed amount of time, necessaryfor high throughput in a context like Node.js (where events ofdifferent types and costs are handled by a single worker pool). AC-WCET-partitioned algorithm can block neither the event threadnor the worker pool, and thus an application composed of C-WCET-partitioned algorithms is invulnerable to EHP attacks. To the bestof our knowledge we are the first to propose this hardline stanceon C-WCET partitioning for EDA applications.

Partitioning an algorithm to reduce each partition’sWCET is alsorecommended by Wandschneider [75], but he suggests partitionswith O(n)-WCET. Wandschneider’s approach is unsound if thecost for each element is non-trivial. True C-WCET partitioningis necessary to make an application invulnerable. And once analgorithm has been partitioned to O(n) −WCET , partitioning itfurther to O(1) −WCET should not be challenging.

C-WCET partitioning is essentially the logical extreme of co-operative multitasking [71], and interprets EDA systems as real-time (embedded) systems with a need for strict forward progressguarantees [78]. C-WCET partitioning also makes predictable theworst-case overall completion time of a request, a useful featurefor application designers. In §5 and §6 we discuss the risk of EHPwhen non-C-WCET-partitioned algorithms are used in the EDA.

We argue that developers should view C-WCET partition-ing as input sanitization. In the EDA, input sanitization mustextend from the realm of data into the realm of the algorithms usedto process it. In the EDA, C-WCET partitioning is as necessary aschecking user input for directory traversal attacks or code injection.

4

Page 5: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

In many cases EHP vulnerabilities are the responsibility of theapplication developer. If the application performs expensive workon the event loop or offloads enormous, unpartitioned tasks tothe worker pool, we cannot help except on a case-by-case basis.Where we can help more generally, however, are cases where theapplication makes a reasonable use of the framework APIs, butthe implementation of these APIs disappoints. Using Node.js asa case study, we identify and repair two framework APIs whoseinfelicitous implementations expose any Node.js applications usingthem to EHP attacks. In §5 we identify and address a solution toa CPU-bound API, and in §6 we do the same for an I/O-boundAPI. By applying C-WCET techniques at the framework level, ournode.cure prototype offers the Node.js community a high-impact,easy-to-adopt solution to some EHP attacks.

5 CPU-boundEHPVulnerabilities in theEDA

In this section we discuss CPU-bound EHP vulnerabilities. Wefocus our attention on ReDoS, describing existing vulnerabilities innpm (§5.1) and demonstrating a defense that incorporates a hybridregular expression engine into Node.js (§5.2). We close with anexploratory manual investigation of 80 npm modules, showing thatthe Node.js community’s practices are vulnerable to EHP becausethey do not follow our C-WCET partitioning approach (§5.3).

5.1 ReDoS VulnerabilitiesAfter explaining what ReDoS vulnerabilities are and describing theassociated threat model, this section answers two questions:(1) How can we identify a vulnerable regular expression (§5.1.1)?

If developers cannot do this accurately, their applications arelikely to remain vulnerable to ReDoS.

(2) Are any popular npm modules vulnerable to ReDoS (§5.1.2)?While our discovery of ReDoS vulnerabilities as we answered

the second question motivated our solution (discussed in §5.2),answering the first question informed its design.

We have documented and automated each step of the processoutlined below. Our programs and the resulting data are availableat BLINDED for those interested in reproducing our results orapplying our techniques to other questions.

As discussed in §2.3, ReDoS vulnerabilities require the serverto evaluate a vulnerable regular expression on evil input designedto trigger the worst-case performance of its regular expressionengine. This worst-case performance is exponential in the lengthof the input, so for long input or fast blow-up other clients will bedenied computational resources and thus service. Common regularexpression engines evaluate inputs against patterns synchronously,so in the EDA an ReDoS attack is depicted in Figure 2 (a) or (b).

ReDoS Threat Model We further refine the EHP threat model(§3.1) for ReDoS. Regular expressions are commonly used as a firstline of defense when sanitizing client input. If a client identifiesa vulnerable regular expression being used by the server, he cancompose evil input that triggers the worst-case exponential-timeperformance of the regular expression engine. He might identifythis regular expression by examining regular expressions used tovalidate input on the client side of this server, or by examiningsource code or regular expression libraries. In the EDA, regularexpressions are typically executed synchronously on the event loop,

so ReDoS attacks are EHP attacks that poison this event handler.The server can stymie an ReDoS attack by avoiding the use ofvulnerable patterns, by rejecting the evil input before evaluatingthe regular expression on it, or by using a regular expression enginewith a timeout mechanism.

To obtain evil input, the attacker must derive from the regu-lar expression three strings: the prefix, the pump, and the suffix.Since backtracking behavior is induced by requiring the regularexpression to try exponentially many paths through the input be-fore declaring a mismatch, the attacker uses the prefix to reach thevulnerable point in the regular expression, repeats (or “pumps”) thepump string to repeatedly double the size of the tree explored bythe engine, and adds the suffix to cause a mismatch and trigger thebacktracking.

5.1.1 Identifying Vulnerable RegexesA seemingly straightforward defense against ReDoS is to avoid theuse of vulnerable regular expressions. But how can a developer de-termine whether a regular expression is vulnerable? In this sectionwe demonstrate that while there are open-source tools that performthis classification, they do not do so accurately.

We identified two open-source tools that claim to answer thisquestion: safe-regex [17] and rxxr2 [68]3. We refer to these toolsas our ReDoS oracles. safe-regex labels as vulnerable all regularexpressions with star height [51] greater than one (i.e. nested quan-tifiers), while rxxr2 performs a more complex analysis based on anidealized backtracking regular expression engine.

We next located a source of “ground truth” with which to evalu-ate these tools. Security vulnerabilities in npm modules are publi-cized by the Node Security Platform (NSP) 4 and Snyk.io [18]. Theseorganizations report on a range of vulnerabilities including ReDoS.As reported in [28], we extracted 43 potentially-vulnerable regu-lar expressions from ReDoS vulnerabilities reported by NSP andSnyk.io as of February 1, 2017, and then applied both safe-regex

and rxxr2 to evaluate each regular expression for vulnerability.Untrustworthy OraclesWe found that neither safe-regex nor

rxxr2 is trustworthy, either alone or when used in combination. Byassessing the decision of each oracle for accuracy, we determinedthe kinds of vulnerable regular expressions each was good and badat identifying.

We evaluated each of the 43 potentially-vulnerable regular ex-pressions from NSP and Snyk.io for true vulnerability using threepopular backtracking engines, those built into Perl, Python, andNode.js. We defined a regular expression as vulnerable in an engineif the engine takes more than 500 ms to evaluate an input thathas been “pumped” at most 20,000 times. This practical means ofevaluation gives us strong evidence for true vulnerabilities, but itnecessarily filters out vulnerabilities that may only manifest in anengine on (absurdly) long input. Note that due to differing engineimplementations, expressions may be vulnerable on some enginesand safe on others.

Here is a description of the result of this study:(1) rxxr2 reported no false positives – it was correct for every reg-

ular expression it labeled vulnerable. Since rxxr2 recommends

3We are aware of one other potential ReDoS oracle, Microsoft’s SDL Regex Fuzzer [72].However, this tool is no longer available.4See https://nodesecurity.io/

5

Page 6: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

evil input, we were able to use this input to exploit each of thevulnerable regular expressions it reported.

(2) safe-regex reports false positives – regular expressions thatcould in principle be vulnerable but which are not in any of thebacktracking engines we evaluated. In our dataset, these falsepositives came in two forms: non-dangerous quantifiers andsmall specific quantifiers. The use of the “optional” quantifier‘?’, as in /(X+)?$/, does not seem to be problematic for theseengines. Regular expressions containing small specific quan-tifiers, like /(X2)+$/, are handled quickly up to very longinput.

(3) safe-regex reports true positives that rxxr2 does not. It is notclear whether rxxr2’s unsoundness comes from rxxr2’s designor its implementation.

(4) rxxr2 reports true positives that safe-regex does not. Thesetake two forms: nested quantifiers with an OR’d inner expres-sion, and quantified expressions with overlapping OR’d classes.First, safe-regex does not correctly detect regular expressionsof the form /(X+|Y)+$/, though this is clearly an example of astar height greater than one5. Second, it misses regular expres-sions with quantified overlapping OR’d classes, like this one:/(X|X)+$/. This regular expression is vulnerable because oninput of the form “X...X” that does not match, the engine willbacktrack to try each possible division of the X’s between thefirst and second OR’d classes – there are exponentially manychoices in the length of the input. Note that this vulnerabilityis not due to the star height of the expression; the backtrackingis incurred using other means.

(5) Neither safe-regex nor rxxr2 identifies another form of vulner-able regular expression: adjacent quantified expressions withoverlapping contents, like this one: /\d+\d+$/. This class ofexpressions is vulnerable for essentially the same reason as thepreviously-discussed vulnerability. Curiously, safe-regex’s testsuite includes regular expressions of this form in its list of saferegular expressions.

(6) Though there was only one example in this dataset, neithersafe-regex nor rxxr2 seems to identify regular expressions con-taining exponential-time features. This inaccuracy is partic-ularly ironic because backreferences and lookahead are theonly components of standard regular expression languagesthat actually must require exponential-time evaluation in theworst case [36]. Though backreferences are outside the scopeof safe-regex’s investigations, rxxr2’s authors claim it shouldconsider them [68]. On the other hand, rxxr2’s implementationdoes not support regular expressions with lookahead. Table 2contains an assortment of vulnerable backreference-using reg-ular expressions that safe-regex and rxxr2 incorrectly labelsafe.

The most notable weakness of safe-regex and rxxr2 is that theydo not consider exponential-time regular expression features. Tominimize the false negatives reported by our ReDoS oracles, wehave developed a novel oracle has-exp-features that identifies reg-ular expressions that use exponential-time features and labels them

5This is a bug in safe-regex that we have patched. See our pull request atBLINDED . We subsequently refer only to the patched version of safe-regex.

Regular expression Vulnerable in which engine?/android.+(\w+)\s+build\1$/ Node.js, Python

/(\w+)\1$/ Node.js, perlTable 2: These regular expressions contain backreferences,and are vulnerable in the indicated engines. The first regu-lar expression in the table is used in the ua-parser-js npmmodule. The second regular expression is a simplified ver-sion emphasizing the source of the vulnerability. For “evilinput” to the final expression, we used an empty prefix, thepump string “a”, and a suffix of “!” to cause a mismatch andtrigger backtracking.

vulnerable. has-exp-features thus has no false negatives but manyfalse positives.

Armed with three ReDoS oracles, two existing and one novel,we searched for true ReDoS vulnerabilities in the wild.

5.1.2 ReDoS Vulnerabilities in npm ModulesIn this section we discuss the process by which we identified 27,808regular expressions used in the wild, and elaborate on how weidentified 55 as truly vulnerable and an additional 101 as probablyvulnerable.

Choosing The Software to Study To identify ReDoS vulner-abilities, we first had to identify regular expressions used in realsoftware. To ensure a fair evaluation, we restricted our definition of“real software” to emphasize well-maintained software. We decidedto investigate software in the Node.js package ecosystem, npm,since it is open-source and has a large and active developer com-munity. npm as an ecosystem also remains relatively unexplored.We searched the npm ecosystem for packages with relatively largenumbers of commits (as a proxy for being well-maintained) andrelatively large monthly download rates to maximize the impact ofour findings. Here are the steps we took:(1) We cloned the npm registry on May 3, 2017. This registry con-

tains the metadata for each module defined in the npm ecosys-tem, including the location of its source code repository. Theregistry had information on 447,362 modules.

(2) We filtered the registry for the 348,691 modules whose code ishosted on GitHub, which has an easy-to-query API from whichwe could extract commit numbers. We excluded from this setan additional 78,890 modules which had no monthly downloadsor for which we did not identify at least one commit (e.g. theGitHub repository no longer existed). This left us with 269,801modules.

(3) For each survivingmodule, we obtainedmaintenance data (com-mits) from GitHub and popularity data (monthly downloads)from npm. We only counted the commits made by the project’stop 30 contributors due to GitHub’s rate limiting policy.

We visualized this data in Figure 3, which shows the frequencyof different <commits, downloads> values on a log-log scale. On therainbow scale, the darkest red hexagons have the highest frequency,the deepest violet the lowest. The red lines indicate two standarddeviations above the arithmetic mean in each dimension. Using thisanalysis, we were able to select the “best maintained, most popular”modules – those in the upper-right quadrant, 1,591 in all. See [79]for other work related to the npm ecosystem.

6

Page 7: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

Log_10 Total Contributions

Lo

g_

10

Mo

nth

ly D

ow

nlo

ad

s

2

4

6

8

0 1 2 3 4 5

Counts

1

448

894

1341

1788

2235

2682

3128

3575

4022

4468

4915

5362

5809

6256

6702

7149

Figure 3: This figure shows the density of the downloads vs.contributions chart. Redder colors are the most dense, somany modules have no more than 10 contributions and 100downloads per month. The red lines plot two standard devi-ations above the arithmeticmean on each axis, and intersectat 270 commits and 3500 downloads per month. We label as“well maintained and popular” those modules in the upper-right quadrant. In clockwise order from the lower-left, thequadrants contain 250,164, 6,602, 1,591, and 11,444 modules.

Extracting Regular Expressions From npm Modules Hav-ing obtained a dataset of popular, highly-downloaded npmmodules,our next step was to extract the regular expressions they used. Asthere are well-known difficulties inherent in statically analyzing adynamic language like JavaScript, we opted for a simpler dynamicapproach. To this end, we instrumented Node.js v6.10.36 to emitregular expressions as they were declared.

To determine the regular expressions used by our dataset of npmmodules, we installed each of them and ran its test suite. Thoughwe don’t expect the test suites to offer complete code coverage,regular expressions are typically used to sanitize input, and thuswe expect a reasonable test suite to exercise most if not all of theregular expressions used by the module.

We were able to successfully run the test suites of 1,182 of our1,591 chosen modules7, capturing 2,282,330 regular expression dec-larations, which we reduced to a set of 27,808 unique patterns. Thelarge number of repeated regular expressions was likely due to ouruse of dynamic analysis; we captured regular expressions declaredanywhere in the running test suite, whether in a module’s code orin the code of one of its dependencies. For example, if modules Aand B both depend on module C , and C uses a regular expression,this regular expression would appear twice in our unfiltered set.

6We applied our changes to Node.js commit a0bfd4f5 (5 May 2017)7Of the remainder, some test suites would not run, and others timed out under ourfive minute cap.

Vulnerable Regular Expressions Having identified a set of27,808 real regular expressions from well-maintained modules, weasked our ReDoS oracles about each of them. Figure 4 illustratesthe varying opinions the oracles gave on the 4,113 regular expres-sions reported “vulnerable” by at least one oracle. We used rxxr2’srecommendations for evil input to identify 55 as truly vulnerable.As rxxr2’s recommendation format is ill-defined, we were unableto automatically reach conclusions on the remaining 101 regularexpressions it flagged as vulnerable. Since neither safe-regex norhas-exp-features recommends evil input, we were unable to vali-date these dynamically. The inaccuracies of the oracles discussedin §5.1.1, however, suggests that there are likely truly vulnerableregular expressions in each area of the figure. The development ofimproved ReDoS oracles remains an area for future work.

Figure 4: This figure shows the overlap between the deci-sions of our ReDoS oracles on our set of wild regular ex-pressions. We asked each oracle about the 27,808 regular ex-pressions in our dataset, and 4,113 were labeled vulnerableby at least one oracle. rxxr2 (RX) was the most conservative,while safe-regex (SR) and has-exp-features (HE) flagged sim-ilar quantities of expressions. The small amount of over-lap between SR and HE indicates that real regular expres-sions are not “too complicated” – few expressions used bothnested quantifiers and backreferences or lookahead. We dy-namically demonstrated ReDoS attacks on 55 of these regu-lar expressions using the evil input recommended by rxxr2,but did not investigate whether attackers could reach theseexpressions with evil input.

5.2 Addressing ReDoS in the EDAMany of the possibly- and truly-vulnerable expressions found in ournpm study (§5.1) need not be vulnerable in practice. Though regularexpressions that use exponential-time features are unavoidablyvulnerable, the remainder would be safe were they evaluated usinga linear-time regular expression engine instead of Node.js’s built-inexponential-time engine, Irregexp8. The size of this opportunityis illustrated in Figure 4: those not flagged by has-exp-features

(52%) could be made safe. While this observation has previouslybeen made by Cox [36], we are not aware of a previous attempt8Regular expression processing is part of the V8 JavaScript engine. V8 actually has twoengines, Atom and Irregexp. While Irregexp is used in general, Atom servicesthe simplest regular expressions, those that can be handled with strstr.

7

Page 8: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

to implement a hybrid regular expression engine at the languagelevel.

We introduced a linear-time regular expression engine intonode.cure, modifying the implementation of regular expressionsin the underlying V8 JavaScript engine. We used the RE2 regularexpression engine, which is the most prominent linear-time regularexpression engine and supports perl-compatible regular expressions(PCRE) [15]

We found that in normal usage Irregexp is significantly fasterthan RE2. We evaluated an RE2-enabled version of Node.js using twobenchmarks, Li’s benchmarks [3] and the regular expression-relatedtests from the V8 Octane benchmark suite [7] . Our results areshown in Figure 5. These benchmarks use safe input on generally-safe regular expressions, so we see that in normal usage Irregexp

outperforms RE2.

Figure 5: On the Li and V8 Octane regular expression perfor-mance benchmarks, Irregexp significantly outperforms RE2.We show here 50 runs of the performance of three versionsof Node.js: vanilla Node.js, a version that checks whetherRE2 can be used but does not use it, and a version that usesRE2whenever possible. We traced this overhead to the actualevaluation of the regular expression by the two engines.

Given this disparity in performance, node.cure only uses RE2 toevaluate potentially-vulnerable regular expressions, otherwise fa-voring Irregexp. node.cure queries the ReDoS oracles we identifiedin §5.1.1 to determine whether a regular expression is vulnerable.Querying oracles is expensive (about 0.2 seconds per query, unopti-mized), so after querying a particular pattern we cache the result.The algorithm for choosing the regular expression engine to useis shown in Figure 6. Our implementation, about 800 lines long, iscompletely within the V8 JavaScript engine used by Node.js.

Attacking Our Solution node.cure does not partition regularexpression evaluation; as in JavaScript regular expressions are ex-ecuted synchronously, and in Node.js this is done on the eventloop, node.cure only seeks to reduce the WCET of the evaluationfrom exponential-time to linear-time where possible by execut-ing vulnerable compatible regular expressions using RE2. In otherwords, node.cure follows the pattern of Figure 2 (a), but tries toexponentially reduce the size of the work.

Figure 6: Decision tree for using RE2 vs. the existing V8 regu-lar expression engines. We use RE2 as a last resort because itsperformance on “good path” patterns and inputs is inferiorto that of Irregexp according to our benchmarks. The deci-sions are made in the order of increasing cost, culminatingin a query to the ReDoS oracles. The number of regular ex-pressions from our dataset that took each path is shown in(parentheses).

This leaves servers using node.cure open to three avenues ofattack. First, though RE2 handles many vulnerable regular expres-sions, an attacker can still carry out standard ReDoS attacks on thesubset of vulnerable regular expressions that contain exponential-time features (i.e. that cannot be handled by RE2 in linear time).Second, an attacker can identify vulnerable regular expressionsthat could be handled by RE2 but which our ReDoS oracles do notidentify as vulnerable. This threat can be ameliorated by more ac-curate oracles. Lastly, an attacker can take advantage of the “slowpath” in Figure 6. If he can convince a server running node.cure toevaluate an uncached regular expression that can be handled withRE2, the server will repeatedly have to follow the “slow path” andquery the ReDoS oracles. Even with the addition of a database oforacle decisions with which prime the cache, the attacker couldpropose novel expressions not in the database. Faster ReDoS oraclesare needed to defend against this threat.

Limitations node.cure cannot protect developers from them-selves. If developers use regular expressions with exponential-timefeatures, these expressions must be handled by an exponential-time regular expression engine. Vulnerable regular expressions likethose of Table 2 will remain vulnerable when using node.cure. Thisis an unavoidable consequence of supporting a “regular expression”language that allows context-free language construction.

RE2 is not completely compatible with Irregexp. Consider theregular expression /(?:(a)|(b)|(c))+/with input abc9. Irregexpreturns a single match c, while RE2 returns three separate matchesa, b, and c10. Disagreements of this kind mean that applicationswould need to be ported to node.cure rather than using it as adrop-in replacement for Node.js. We are not aware of a previousobservation of this disagreement, nor do we understand its cause.

As flags are not the root cause of the exponential-time matching,supporting them is orthogonal to defending against EHP attacks.Though our implementation honors the case-insensitive flag (i), it9This example is taken from v8/test/mjsunit/regexp-loop-capture.js.10Perl and Python also disagree with Irregexp.

8

Page 9: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

refers patterns with other flags to V8’s default regular expressionengines. We deferred support for the global match flag (д) for futurework due to the complexity of V8’s implementation. For JavaScript’smultiline (m), unicode (u), and sticky (y) flags, we could not find anappropriate analogue in RE2.

Practicality Despite the limitations discussed above, in its cur-rent state node.cure can be used by real Node.js applications. node.curepasses V8’s regexp tests, runs our small Node.js applications, andcan also handlemore complexNode.js software like npm’s command-line utility.

5.3 Other CPU-bound VulnerabilitiesAs argued in Section §4, in the EDAC-WCET-partitioned algorithmsare safe from EHP. We examined 80 npm modules in detail to deter-mine how they handle potentially computationally expensive tasks,manually inspecting both documentation and implementation. Wechose the first 20 modules returned by npm searches for string,string manipulation, math, and algorithm, in hopes of striking abalance between relevance (high usage) and potential algorithmiccomplexity.

Though space limitations prevent a detailed description of ourfindings, they were troubling. The vast majority of the APIs exe-cuted synchronously on the event loop. Exactly two of the hun-dreds of APIs we evaluated were partitioned, and one of these syn-chronously executed an exponential-time algorithm before yieldingcontrol. Although the execution time of an API is critical to pre-venting EHP, the documentation for the majority of the APIs didnot state their running times. During our analysis of the sourcecode we identified algorithms with complexities ranging fromO(1)and O(n) all the way to O(2n ).

These findings were all the more surprising because the moduleswe selected are widely used, and were downloaded over 200 mil-lion times in December 2016. We suggest, then, that npm moduledevelopers are not cognizant of the risks of unbounded non-C-WCET-partitioned algorithms, and that many Node.js applicationsare therefore likely vulnerable to CPU-bound EHP attacks. Thesefindings are in line with previous work studying the frequencyof asynchronous callbacks in server-side JavaScript, though weidentified a lower proportion of asynchronous callbacks than theydid [47].

While Node.js application developers could offload these callsto the worker pool, e.g. using NAN or the cluster module, doingso may simply move the poison from the event loop to the workerpool §3.

6 I/O-boundEHPVulnerabilities in theEDA

This section presents I/O-bound EHP vulnerabilities. We first dis-cuss the support for file I/O in EDA frameworks (§6.1). Then, weexploit weaknesses in this support in Node.js to demonstrate I/O-bound EHP (§6.2). Finally, we describe how node.cure leveragesTrue Asynchronous I/O to eliminate these vulnerabilities (§6.3).

6.1 File I/O in the EDAThere are three Types of file I/O in the EDA. File I/O can be done(1) synchronously on the event loop, (2) “asynchronously” (syn-chronously but on the worker pool), or (3) truly asynchronously

(offloaded to the operating system’s asynchronous I/O facilities).libuv offers Type 1 and Type 2 I/O using typical read and write

APIs. Node.js also offers Type 1 and Type 2 I/O built on the libuvofferings. In addition to read/write[Sync], Node.js offers full-fileread/writeFile[Sync], and read/writeStream APIs. Twisted doesnot provide dedicated file I/O APIs, but developers could readilyimplement Type 1 or Type 2 I/O using Python’s file system support.We could not identify a correct npm or Python module offeringType 3 file I/O.

The use of type 1 I/O is frowned upon in the EDA community forobvious reasons, and developers favor the type 2 APIs offered bytheir EDA framework. We found that the Type 2 Read APIs offeredby Node.js and libuv expose applications to EHP, as discussed inthe next section.

6.2 Attacking EDA File I/OFile I/O Threat Model Consider two kinds of files: Large Files andSlow Files. Large Files are files in the MB or GB range, or effectivelybottomless ones like /dev/zero. Slow Files are those whose accesstimes are variable and potentially extremely slow, e.g. /dev/random11or files accessed over a network share.

As in §3.1, we assume that attackers can feed “evil input” tovulnerable file I/O APIs. If an attacker persuades a Node.js server toread from a Large or Slow File using Type 1 or Type 2 I/O, he maypoison the underlying event handler. This handler is either the eventloop (Type 1) or a worker thread (Type 2); either way, if the handlertakes a long time to complete the read, it cannot be used by otherclients. Though some forms of directory traversal vulnerabilitiescould be exploited in this way, this attack is somewhat more general:the EHP attacker is not interested in the contents of the file, butmerely that it be read. This threat model is analogous to that forReDoS: the attacker must be able to submit his own input to avulnerable API call.

Whenare largefiles and slowfiles problematic in theEDA?Type 1 and Type 2 APIs are unsafe under this threat model, thoughwhether they are vulnerable to large or slow files depends on theirimplementation. Focusing on Node.js, let us consider the high-levelAPIs in turn12. The Type 1 APIs are trivially unsafe, because theyperform a relatively slow task (file I/O) on the event loop. We weresurprised, however, to find that the Type 2 readFile API is vul-nerable to large files, because it is implemented as a single read

spanning the entire file. On the other hand, large files will not poi-son readFileStream, because it partitions the read into fixed-sizetasks. Between each task other pending tasks can be handled.

Unfortunately, both readFile and readFileStream can be poi-soned by slow files. The slow-ness affects each I/O request, so eachtask poisons the worker that handles it. Partitioning tasks in sizedoes not save readFileStream from a slow file’s variability in time.

Using the language of §4, neither readFileSync nor readFile ispartitioned, and though readFileStream is partitioned, it is not C-WCET partitioned. The implementation of readFileSync is shownin Figure 2 (a), readFile in (b), and readFileStream in (c). A hypo-thetical Type 3 API is shown in (e) — by offloading the work to

11According to “man 4 random”, “reads from /dev/random may block...users willusually want to open it in nonblocking mode (or perform a read with timeout).”12The read API is vulnerable to large and/or slow files depending on the size of theread. Its vulnerabilities follow the same argument.

9

Page 10: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

the operating system, none of the Event Handlers (event loop orworker) need execute I/O synchronously.

Though developers can avoid these issues by writing their ownI/O libraries to use Type 3 I/O, providing safe asynchronous fileI/O is fundamentally the responsibility of the EDA frame-work. Developers should be able to trust their framework to cor-rectly perform file I/O in an appropriately partitioned (a la read-FileStream) and truly asynchronous fashion.

In node.cure, we propose extensions to Node.js to address theseissues (§6.3). Because node.cure addresses all such vulnerabilities,we have not attempted to identify specific examples of these vul-nerabilities in the wild. Interested parties could expose some ofthem using standard taint analysis techniques, with user input assources and file system APIs as sinks. However, even untainted fileI/O can be poisonous depending on the APIs used, file sizes, andunderlying storage performance.

6.3 Addressing I/O-bound EHP in the EDAAs discussed in §6.2, the implementation of Node.js’s (libuv’s) built-in I/O APIs exposes applications to EHP vulnerabilities under ourthreat model. We have designed, implemented, and evaluated analternative approach that resolves these issues.

Node.js’s I/O design takes a portable approach, at the cost ofexposing applications to EHP attacks when doing I/O. On Linux wecan defend against this class of attacks by making use of Linux’ssupport for Kernel Asynchronous I/O (hereafter, KAIO) to offertrue asynchronous I/O13. At a high level, KAIO maintains a queueof pending I/O requests, handling each in turn, which scales wellwith the number of requests for the same reason that Node.js does.KAIO can deal with slow files more effectively than can userspace,so using KAIO for slow files won’t just move the problem to thekernel. Because KAIO scales well, Node.js could in principle useKAIO to offload I/O requests to the kernel rather than performingthem in a userspace worker pool.

Some low-level details about KAIO inform our design. KAIO of-fers an io_submitAPI to submit anAIO request, and an io_getevents

request to test for completed I/O. Though applications can repeat-edly query io_getevents to test for completion, Linux will also sig-nal a file descriptor supplied to io_submit for use with epoll. KAIOrequires callers to use file descriptors opened with the O_DIRECT flag(i.e. bypass caches), and I/O request offsets, sizes, and buffers mustbe aligned on sector boundaries.

Performing all I/O with KAIO would improve overall through-put, by letting node processes saturate the IO layer even with asmall worker pool (no limitation to k concurrent I/O requests).However, doing so would come at a cost: it would dramaticallyincrease the average response time for requests that require I/O.Average response times would rise because KAIO requires use ofthe O_DIRECT flag, which bypasses caching on both write and,critically, read. In consequence, a KAIO-only solution would of-fer throughput and EHP-proofing but dramatically increase theresponse time for requests. We observe, however, that the EHP con-cern arises only when Large or Slow Files are accessed. AddressingLarge Files is straightforward. At the libuv level, we can detect anddefer only slow I/Os to KAIO. When deciding to use KAIO, we13Windows also offers operating system-level asynchronous I/O, but we did not extendour prototype there.

follow the same principle as in our ReDoS solution (see Figure 6):we favor the existing fast path unless we have evidence that theI/O is vulnerable, at which point we drop to the slow-but-secureKAIO path.

True AIO for Node.js We apply a two-tiered approach to ad-dress large and slow files. First, though the existing implementationperforms the I/O for readFile in one large request, there is no needfor this at API level. Thus, we partition readFile requests just likethose of readFileStream, and buffer the output until the file hasbeen completely read. With this change in place, the remainingconcern for I/O is slow files.

We then add a slow path which performs I/O using KAIO ratherthan by offloading to the worker pool. With the insertion of stop-watch logic in the Node.js C++ library that offloads I/O to libuv, wecan route accesses to fast files to the existing logic and defer slowfiles to KAIO. Slow files are those that are outliers in the distributionof I/O times. When an I/O takes a long time, we obtain the numberof the underlying inode using fstat and add it to an in-memoryslow inode list. All subsequent accesses to this inode using any filedescriptor are routed to the slow path, to be performed using KAIO.We register the corresponding eventfd in the event loop’s epoll

set and respond appropriately when it is ready. Our slow path I/Ois C-WCET partitioned; it requires a small series of constant-timeoperations, carried out on the event loop. Our implementation innode.cure only affects the read path, since a system that exposes ar-bitrarywrites to an attacker has greater concerns than EHP-inducedDoS. However, it could easily be extended to the write path.

Our implementation of readFile partitioningwas simple becauseit could re-use the existing support for readFileStream, requiringonly 20 lines in the Node.js JavaScript-level file system library.Adding the “safe but slow” KAIO path to node.cure required about3500 lines in libuv, of which around 3000 was an existing hash tableimplementation [20].

EffectivenessWe used a microbenchmark on Linux to illustratethe effectiveness of node.cure’s partitioning and KAIO schemes.Our benchmark launches k read requests to saturate the workerpool. These requests are made against a 2MB file stored in an artifi-cially slow file system14. After launching the request, it attemptsto complete a CPU-bound workload consisting of 100 zip requestssubmitted serially. These requests are also offloaded to the workerpool, where they compete for worker processing time with theongoing slow read requests.

We evaluated this benchmark using readFile against a slow 2MBfile. In Figure 7 you can see that node.cure (partitioned and with theKAIO slow path) detects the slow file, moving the k read requests toKAIO and allowing the zip workload to complete quickly. With onlyread partitioning, node.cure allows the zip requests to interleavewith the partitioned slow reads, and the zip workload is able tomakeconsistent progress. This is the same performance that would beseen by an application using vanilla Node.js and the readFileStreamAPI. We also evaluated vanilla Node.js on its normal I/O path forreadFile15. Because the read is not partitioned, one of the workers

14We enforced this file system’s 100 KB/second I/O rate by creating a file system on aNetwork Block Device whose server is gated by the trickle utility [8].15To eliminate caching effects, we modified vanilla Node.js to use O_DIRECT. Thisensured that the I/O performance was as slow as the underlying storage system — andalso illustrates why using O_DIRECT is not a good idea by default.

10

Page 11: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

must complete its entire read before the zip workload’s tasks willhave a turn. At around 15 seconds one of the reads completes, afterwhich the zip workload can complete.

Figure 7 teaches us a key lesson of C-WCET partitioning: it isan all or nothing affair. Though the zip code in Node.js is perfect —C-WCET partitioned and offloaded to the worker pool — a series ofcompeting ill-partitioned tasks still blocks its forward progress.

Figure 7: In this figure we see the effect that slow I/O re-quests in Node.js have on other worker pool tasks, andthe effectiveness of node.cure in combating these requests.node.cure quickly detects and removes the slow I/O requestsfrom the worker pool, allowing the zip workload to com-plete. Without KAIO, the zip workload suffers because itshares the worker pool with the slow I/O.

Attacking Our Solution node.cure’s defense against EHP canbe attacked in a way similar to that of its defense against ReDoSEHP attacks. In each case, an attacker who can repeatedly forcenode.cure to “try new things” will trigger expensive “oracle queries”.In the context of regular expressions, the oracle query was a literalquery to our ReDoS oracles. In the file I/O setting, the oracle is thefirst slow read of a slow file. The event handler performing the readwill be poisoned until it completes (or times out, though node.cure

does not implement this), just as the event loop is poisoned whilewe query the ReDoS oracles about a new regular expression. There-fore, against both of node.cure’s defenses an attacker can level anongoing attack stream of novel requests. In doing so he shifts froma one-time request to a “slow DoS” attack [34].

Closing Thoughts With the changes in node.cure, asynchro-nous DNS queries are the only remaining built-in I/O-intensivework performed by the Node.js worker pool. If these could be re-placed with asynchronous kernel APIs, it would free the workerpool threads to focus on (hopefully C-WCET-partitioned) CPU-intensive tasks. Doing so would reduce the EHP attack surface ofNode.js applications to CPU-intensive algorithms. Unfortunately,the Linux kernel offers no asynchronous DNS resolution API16, sothis idea remains only a vision.

Our design for True EDA AIO is applicable to any EDA frame-work that currently implements I/O by offloading it to a userspaceworker pool. This includes both Twisted and libuv, and likely oth-ers. Since the KAIO portion is in libuv, our implementation actuallyprotects both Node.js- and libuv-based applications.

16libc’s getaddrinfo_a is implemented with a threadpool.

7 DiscussionInsecure Fast Path, Secure Slow PathThe software development community has long favored perfor-mance over security, so we were unsurprised that in both regularexpression engines and asynchronous file I/O, the “safe” approachwas slower than the existing “fast” approach. For practical rea-sons, in both aspects of node.cure we therefore favored the speedyexisting implementation until we knew that we had a potentialvulnerability. In the regular expression component of node.cure

this decision is made at regular expression declaration time througha query to our ReDoS oracles. In the I/O component, the decisionis made based on the file I/O speeds.

Disastrous PortabilitySoftware is often implemented with portability in mind. While thisis an admirable goal, pursuing portability can be myopic. We foundthat the Node.js and libuv frameworks favor simpler implementa-tions (regex engine) and “lowest common denominator” portability(asynchronous file I/O) to reduce the size of their code bases. Thisis a false economy.

We believe that in the EDA, framework developers must acknowl-edge EHP and provide applications with well-defined WCET wherepossible. Avoiding DoS attacks is well worth the cost of maintainingplatform-specific implementations of the same functionality. WhileEHP can be avoided by applications that correctly sanitize userinput, they do not always do so. The burden falls to frameworkdevelopers to address security vulnerabilities where they can.

Is C-WCET partitioning practical?We have argued that C-WCET partitioning is needful in the EDAto defend against EHP attacks. We also believe that it is practical atthe application level. The EDA community is already familiar withdeveloping and debugging asynchronous algorithms, and C-WCETpartitioning is simply an extension of this concept.

C-WCET partitioning requires three things of the developer: shemust identify a partitioning of the algorithm, and then must bothenforce ordering constraints and ensure her algorithm does not re-quire atomicity across partitions. We believe identifying a C-WCETpartitioning for typical algorithms will be straightforward, and theease of sharing software using npm means that a single C-WCETpartitioned algorithm can be readily used in many applications.

The EDA community is well equipped to enforce ordering con-straints using concepts like Promises (Node.js) and Futures (Twisted).Ensuring that a non-atomically-executed algorithm works correctlycan be non-trivial, but JavaScript’s closures feature gives Node.js de-velopers an easy way to privately store ongoing results. When a de-velopermakes amistake in her C-WCET partitioning and introducesordering or atomicity violations, she can draw on EDA-specific de-bugging tools that help developers detect race conditions [38, 59].

At the framework level, C-WCET partitioning is harder to achieve,particularly in complex software like Node.js. In this work we havedemonstrated two modifications to the Node.js runtime to move itcloser to our goal of C-WCET partitioning. These were non-trivialextensions requiring a deep understanding of libraries on whichNode.js relies: the V8 JavaScript engine and the libuv event-drivenframework. Though we have addressed what we view as two ma-jor sources of EHP vulnerabilities, others remain because otherframework-level aspects of Node.js still do not provide C-WCET

11

Page 12: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

partitioning. These include JavaScript language constructs like ar-ray and object accesses, which are O(N ) in the worst case, as wellas built-in Node.js modules like OS and Util.

At the application level, there is some tension in whether devel-oping C-WCET algorithms is appropriate – “To write generic code,or to write EDA-specific code?” In Node.js the answer is clear, sinceJavaScript is essentially only used in event-driven contexts (webbrowsers and Node.js). This in turn implies that all npm modulesshould be C-WCET partitioned, without exception. For EDA-basedapplications in other languages, e.g. Python’s Twisted and C’s libuv,we anticipate that C-WCET-partitioned versions of popular librarieswill need to be developed and maintained in tandem with the “nor-mal” version.

C-WCET only helps if tasks are of different sizes. Otherwisethe throughput remains the same but the response time actuallygoes up. This can be controlled with the partitioning size – pick orauto-tune towards a good default if requests are generally the samesize. Such a size is still fine because we’re protected from poisonousoutsized requests.

7.1 Alternatives to C-WCET partitioningAlongside C-WCET partitioning, we recommend the introduction ofruntime-generated TimeoutExceptions into EDA implementations.A TimeoutException would abort synchronous computation after aspecified time limit, allowing developers to use open-source mod-ules while still ensuring protection against EHP vulnerabilities.

8 Related WorkIn §2.3, we have presented previous work on AC attacks. In thissection we discuss the way in which our work relates to that ofothers.

Security in the EDA Though researchers have studied secu-rity vulnerabilities in different EDA frameworks, most prominentlyclient-side applications, the security properties of the EDA itselfhave not been researched in detail. Some examples include securitystudies on client-side JavaScript/Web [39, 53, 56, 62] and Java/An-droid [31, 43, 44, 52] applications. However, many of them havefocused on language issues (e.g., eval in JavaScript) or platform-specific issues (e.g., DOM in web browsers). We note that EDA com-plicates static analyses for event-driven Android software becausethey often require inferring control flows across event callbacks asshown in static information flow tracking [29, 45, 49, 76].

To the best of our knowledge, there are only few academic en-gagement with security aspects of server-side EDA frameworks.Ojamaa et al. [64] performed high-level survey on Node.js security,and DeGroef et al. [40] studied a secure library integration. Webelieve that more academic involvement in the EDA will increasethe security of many of the web services we use every day.

C-WCETPartitioningAt their core, operating systems are alsoevent-driven. As the bridge between userspace and hardware, theoperating system is constantly responding to events from user pro-grams and devices. An operating system’s interrupt handlers takegreat care to perform minimal amounts of work in their “top half”,deferring more expensive responses to an asynchronous “bottomhalf” [58].

C-WCET partitioning in the EDA is analogous to introducingthreads in a sequential program. [57] automatically moves Android

code into an AsyncTask, though this is inappropriate for Node.jsbecause of Node.js’s fixed-size worker pool. Work on automaticloop parallelization [61] and Speculative execution [33] might beapplicable to automatically C-WCET partitioning an algorithm.

EDA Community Awareness The EDA community is awareof the risk of DoS due to EHP on the event thread. A common rule ofthumb is “Don’t block the event loop”, advised by many tutorials aswell as recent books about EDA programming for Node.js [35, 75].Wandschneider suggests worst-case linear-time partitioning onthe event loop [75]. Casciaro advises developers to partition anycomputation on the event loop, and to offload computationally ex-pensive tasks to the worker pool [35]. In practice, however, we havefound many popular modules offering computationally-expensiveoperations without taking these recommendations to heart ( §5.3),and also showed that offloading without C-WCET partitioning isjust dumping the bomb to another place.

9 ConclusionThe Event-Driven Architecture holds significant promise for scal-ing. However, EDA-based servers are vulnerable to Event HandlerPoisoning DoS attacks, just like old single-threaded servers were.We believe that the EDA community is incautious about this vulner-ability. We have implemented two defenses against EHP attacks andassessed their success. The source code for our EHP-safer versionof Node.js can be found at BLINDED.

AcknowledgmentsHere we will acknowledge those who have contributed to our work.

References[1] Apache httpd Documentation. https://httpd.apache.org/docs/2.4/mod/. (????).[2] Apache HTTP Server. https://httpd.apache.org/. (????).[3] Benchmark of Regex Libraries. http://lh3lh3.users.sourceforge.net/reb.shtml.

(????).[4] Cylon.js. https://cylonjs.com/. (????).[5] EventMachine. (????). http://rubyeventmachine.com/[6] GitHub. https://github.com/. (????).[7] Google V8 Octane benchmark suite. https://developers.google.com/octane/

benchmark. (????).[8] How to simulate a slow hardisk. http://giallone.blogspot.com/2015/10/

simulate-slow-hardisk-in-linux.html. (????).[9] libuv. (????). https://github.com/libuv/libuvhttp://libuv.org/[10] Module Counts. (????). http://www.modulecounts.com/[11] Node-RED. http://nodered.org/. (????).[12] Node.js. http://nodejs.org/. (????).[13] Node.js Thread Pool Documentation. http://docs.libuv.org/en/v1.x/threadpool.

html. (????).[14] Projects that use libuv. https://github.com/libuv/libuv/wiki/

Projects-that-use-libuv. (????).[15] RE2 Regular Expression Engine. https://github.com/google/re2. (????).[16] Reactor. (????). https://github.com/reactor/reactor-corehttp://projectreactor.io/[17] safe-regex. https://github.com/substack/safe-regex/. (????).[18] Snyk.io. https://snyk.io/vuln/. (????).[19] Twisted. (????). https://twistedmatrix.com/trac/https://github.com/twisted/

twisted[20] uthash. http://troydhanson.github.io/uthash/. (????).[21] Zotonic. http://zotonic.com/. (????).[22] 2007. The Calendar and Contacts Server. https://github.com/Apple/

Ccs-calendarserver. (2007).[23] 2009. The Julia Language. https://github.com/JuliaLang/julia. (2009).[24] 2009. Sage Notebook (flask). https://github.com/sagemath/sagenb. (2009).[25] 2010. The Rust Programming Language. https://github.com/rust-lang/rust.

(2010).[26] 2012. Ubuntu One: Technical Details. https://wiki.ubuntu.com/UbuntuOne/

TechnicalDetails. (2012).[27] 2016. New Node.js Foundation Survey Reports New “Full Stack” In De-

mand Among Enterprise Developers. (2016). https://nodejs.org/en/blog/announcements/nodejs-foundation-survey/

12

Page 13: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

[28] Authors anonymized. 2017. The Case of the Poisoned Event Handler: Weaknessesin the Node.js Event-Driven Architecture. In European Workshop on SystemsSecurity (EuroSec).

[29] Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel,Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. Flow-Droid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware TaintAnalysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conferenceon Programming Language Design and Implementation (PLDI ’14).

[30] Bill Atkinson. 1988. Hypercard. Apple Computer.[31] David Barrera, H. Gunes Kayacik, Paul C. van Oorschot, and Anil Somayaji. 2010.

A Methodology for Empirical Analysis of Permission-based Security Modelsand Its Application to Android. In Proceedings of the 17th ACM Conference onComputer and Communications Security (CCS ’10).

[32] Flavio Bonomi, Rodolfo Milito, Jiang Zhu, and Sateesh Addepalli. 2012. FogComputing and Its Role in the Internet of Things. In Mobile Cloud Computing(MCC). https://doi.org/10.1145/2342509.2342513

[33] Matthew J Bridges, Thomas Jablin, and David I August. 2007. Revisiting thesequential programming model for the mutlticore era. Micro 07 (2007), 12–20.https://doi.org/10.1109/MM.2008.13

[34] Enrico Cambiaso, Gianluca Papaleo, and Maurizio Aiello. 2012. Taxonomy ofSlow DoS Attacks to Web Applications. Recent Trends in Computer Networks andDistributed Systems Security (2012), 195–204.

[35] Mario Casciaro. 2014. Node.js Design Patterns (1 ed.). 454 pages. https://doi.org/10.1002/ejoc.201200111

[36] Russ Cox. 2007. Regular Expression Matching Can Be Simple And Fast (but isslow in Java, Perl, PHP, Python, Ruby, ...). (2007). https://swtch.com/~rsc/regexp/regexp1.html

[37] Scott A Crosby and Dan S Wallach. 2003. Denial of Service via AlgorithmicComplexity Attacks. In USENIX Security. http://www.usenix.org/events/sec03/tech/full_papers/crosby/crosby.pdf

[38] James Davis, Arun Thekumparampil, and Dongyoon Lee. 2017. Node.fz: Fuzzingthe Server-Side Event-Driven Architecture. In European Conference on ComputerSystems (EuroSys).

[39] Willem De Groef, Dominique Devriese, Nick Nikiforakis, and Frank Piessens.2012. FlowFox: A Web Browser with Flexible and Precise Information FlowControl (CCS ’12).

[40] Willem De Groef, Fabio Massacci, and Frank Piessens. 2014. NodeSentry: Least-privilege Library Integration for Server-side JavaScript. In Proceedings of the 30thAnnual Computer Security Applications Conference (ACSAC ’14).

[41] Ankush Desai, Vivek Gupta, Ethan Jackson, Shaz Qadeer, Sriram Rajamani, andDamien Zufferey. 2013. P: Safe asynchronous event-driven programming. InACM SIGPLAN Conference on Programming Language Design and Implementation(PLDI). https://doi.org/10.1145/2462156.2462184

[42] Stephen C. Drye and William C. Wake. 1999. Java Swing reference. ManningPUblications Company.

[43] William Enck, Damien Octeau, Patrick McDaniel, and Swarat Chaudhuri. 2011.A Study of Android Application Security. In Proceedings of the 20th USENIXConference on Security (SEC’11).

[44] William Enck, Machigar Ongtang, and Patrick McDaniel. 2009. UnderstandingAndroid Security. IEEE Security and Privacy (2009).

[45] Yu Feng, Saswat Anand, Isil Dillig, and Alex Aiken. 2014. Apposcopy: Semantics-based Detection of Android Malware Through Static Analysis. In Proceedingsof the 22Nd ACM SIGSOFT International Symposium on Foundations of SoftwareEngineering (FSE 2014).

[46] Steve Ferg. 2006. Event-driven programming: introduction, tutorial, his-tory. (2006), 59 pages. http://parallels.nsu.ru/~fat/pthreadppt/event_driven_programming.doc

[47] Keheliya Gallaba, Ali Mesbah, and Ivan Beschastnikh. 2015. Don’t Call Us, We’llCall You: Characterizing Callbacks in Javascript. In International Symposiumon Empirical Software Engineering and Measurement (ESEM). https://doi.org/10.1109/ESEM.2015.7321196

[48] Danny Goodman and Paula Ferguson. 1998. Dynamic HTML: The DefinitiveReference (1 ed.). O’Reilly.

[49] Michael I. Gordon, Deokhwan Kim, Jeff Perkins, Limei Gilhamy, NguyenNguyenz, , and Martin Rinard. Information Flow Analysis of Android Applica-tions in DroidSafe.. InNetwork and Distributed System Security (NDSS) Symposium(NDSS 2015).

[50] Jeff Harrell. 2013. Node.js at PayPal. (2013). https://www.paypal-engineering.com/2013/11/22/node-js-at-paypal/

[51] Kasaburo Hashiguchi. 1988. Algorithms for determining relative star heightand star height. Information and Computation 78, 2 (1988), 124–169. https://doi.org/10.1016/0304-3975(91)90269-8

[52] Stephan Heuser, Adwait Nadkarni, William Enck, and Ahmad-Reza Sadeghi. 2014.ASM: A Programmable Interface for Extending Android Security. In Proceedingsof the 23rd USENIX Conference on Security Symposium (SEC’14).

[53] Xing Jin, Xuchao Hu, Kailiang Ying, Wenliang Du, Heng Yin, and Gautam NageshPeri. 2014. Code Injection Attacks on HTML5-based Mobile Apps: Characteriza-tion, Detection andMitigation. In Proceedings of the 2014 ACM SIGSAC Conferenceon Computer and Communications Security (CCS ’14).

[54] James Kirrage, Asiri Rathnayake, and Hayo Thielecke. 2013. Static Analysis forRegular Expression Denial-of-Service Attacks. Network and System Security 7873(2013), 35–148. https://doi.org/10.1007/978-3-642-38631-2_11

[55] Hugh C Lauer and Roger M Needham. 1979. On the duality of operating systemstructures. ACM SIGOPS Operating Systems Review 13, 2 (1979), 3–19. https://doi.org/10.1145/850657.850658

[56] Sebastian Lekies, Ben Stock, and Martin Johns. 2013. 25 Million Flows Later:Large-scale Detection of DOM-based XSS. In Proceedings of the 2013 ACM SIGSACConference on Computer &#38; Communications Security (CCS ’13).

[57] Yu Lin, Cosmin Radoi, and Danny Dig. 2014. Retrofitting Concurrency forAndroid Applications through Refactoring. In ACM International Symposiumon Foundations of Software Engineering (FSE). https://doi.org/10.1145/2635868.2635903

[58] Robert Love. 2005. Linux Kernel Development. Novell Press.[59] Magnus Madsen, Frank Tip, and Ondrej Lhotak. 2015. Static analysis of event-

driven Node. js JavaScript applications. InACM SIGPLAN International Conferenceon Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA).ACM.

[60] M D Mcilroy. 1999. Killer adversary for quicksort. Software - Practice and Experi-ence 29, 4 (1999), 341–344. https://doi.org/10.1002/(SICI)1097-024X(19990410)29:4<341::AID-SPE237>3.0.CO;2-R

[61] Mojtaba Mehrara, Po-Chun Hsu, Mehrzad Samadi, and Scott Mahlke. 2011. Dy-namic Parallelization of JavaScript Applications Using an Ultra-lightweight Spec-ulation Mechanism. In Proceedings of the 2011 IEEE 17th International Symposiumon High Performance Computer Architecture (HPCA ’11).

[62] Nick Nikiforakis, Luca Invernizzi, Alexandros Kapravelos, Steven Van Acker,Wouter Joosen, Christopher Kruegel, Frank Piessens, and Giovanni Vigna. 2012.You Are What You Include: Large-scale Evaluation of Remote Javascript Inclu-sions. In Proceedings of the 2012 ACM Conference on Computer and Communica-tions Security (CCS ’12).

[63] J O’Dell. 2011. Exclusive: How LinkedIn used Node.js and HTML5 to build abetter, faster app. (2011). http://venturebeat.com/2011/08/16/linkedin-node/

[64] A Ojamaa and K Duuna. 2012. Assessing the security of Node.js platform. In7th International Conference for Internet Technology and Secured Transactions(ICITST). 348–355.

[65] Senthil Padmanabhan. 2013. How We Built eBayâĂŹs First Node.jsApplication. (2013). http://www.ebaytechblog.com/2013/05/17/how-we-built-ebays-first-node-js-application/

[66] Vivek S Pai, Peter Druschel, and Willy Zwaenepoel. 1999. Flash: An Efficientand Portable Web Server. In USENIX Annual Technical Conference (ATC). https://doi.org/10.1.1.119.6738

[67] David Pariag, Tim Brecht, Ashif Harji, Peter Buhr, Amol Shukla, and David RCheriton. 2007. Comparing the performance of web server architectures. InEuropean Conference on Computer Systems (EuroSys), Vol. 41. ACM, 231–243.

[68] Asiri Rathnayake and Hayo Thielecke. 2014. Static Analysis for Regular Ex-pression Exponential Runtime via Substructural Logics. CoRR (2014). http://arxiv.org/abs/1405.7058

[69] Rick Rogers, John Lombardo, Zigurd Mednieks, and Blake Meike. 2009. Androidapplication development: Programming with the Google SDK (1 ed.). O’ReillyMedia, Inc.

[70] Alex Roichman and Adar Weidman. 2009. VAC - ReDoS Regular ExpressionDenial Of Service. OWASP (2009). https://www.checkmarx.com/white_papers/redos-regular-expression-denial-of-service/

[71] Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne. 2012. OperatingSystem Concepts (9th ed.). Wiley Publishing.

[72] Bryan Sullivan. 2010. New Tool: SDL Regex Fuzzer. (2010). https://blogs.microsoft.com/microsoftsecure/2010/10/12/new-tool-sdl-regex-fuzzer/

[73] Bryan Sullivan. 2010. Security Briefs - Regular Expression Denial of ServiceAttacks and Defenses. (2010). https://msdn.microsoft.com/en-us/magazine/ff646973.aspx

[74] Richard E Sweet. 1985. The Mesa programming environment. ACM SIGPLANNotices 20, 7 (1985), 216–229. https://doi.org/10.1145/17919.806843

[75] Marc Wandschneider. 2013. Learning Node. js: A Hands-on Guide to Building WebApplications in JavaScript. Pearson Education.

[76] Fengguo Wei, Sankardas Roy, Xinming Ou, and Robby. 2014. Amandroid: APrecise andGeneral Inter-component Data FlowAnalysis Framework for SecurityVetting of Android Apps. In Proceedings of the 2014 ACM SIGSAC Conference onComputer and Communications Security (CCS ’14).

[77] Matt Welsh, David Culler, and Eric Brewer. 2001. SEDA : An Architecture forWell-Conditioned, Scalable Internet Services. In Symposium on Operating SystemsPrinciples (SOSP). https://doi.org/10.1145/502059.502057

13

Page 14: Poisoning the Event-Driven Architecture: A Full-Stack Look at …people.cs.vt.edu/~davisjam/downloads/blog/ehp/CCS... · 2020. 6. 30. · user interface settings like web browsers

Anonymous submission #23 to ACM CCS 2017

[78] Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, StephanThesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heck-mann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat,and Per Stenstrom. 2008. The Worst-Case Execution-Time Problem - Overviewof Methods and Survey of Tools. ACM Transactions on Embedded ComputingSystems (TECS) 7, 3 (2008), 36. https://doi.org/10.1145/1347375.1347389

[79] Erik Wittern, Philippe Suter, and Shriram Rajagopalan. 2016. A look at thedynamics of the JavaScript package ecosystem. In International Conference onMining Software Repositories (MSR). https://doi.org/10.1145/2901739.2901743

[80] Zhenyu Wu, Mengjun Xie, and Haining Wang. 2011. Energy Attack on ServerSystems. In USENIX Workshop on Offensive Technologies (WOOT).

14