formal foundations of serverless computing · 2020-02-27 · formal foundations of serverless...

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

Formal Foundations of Serverless ComputingAbhinav Jangda Donald Pinckney Samuel Baxter

Joseph Spitzer Breanna Devore-McDonald Yuriy Brun

Arjun GuhaUniversity of Massachusetts Amherst

AbstractA robust, large-scale web service can be di�cult to engineer.When demand spikes, it must con�gure new machines andmanage load-balancing; when demand falls, it must shutdown idle machines to reduce costs; and when a machinecrashes, it must quickly work around the failure without los-ing data. In recent years, serverless computing, a new cloudcomputing abstraction, has emerged to help address thesechallenges. In serverless computing, programmers writeserverless functions, and the cloud platform transparentlymanages the operating system, resource allocation, load-balancing, and fault tolerance. In 2014, AmazonWeb Servicesintroduced the �rst serverless platform, AWS Lambda, andsimilar abstractions are now available on all major clouds.Unfortunately, the serverless computing abstraction ex-

poses several low-level operational details that make it hardfor programmers to write and reason about their code. Thispaper sheds light on this problem by presenting � , an op-erational semantics of the essence of serverless computing.Despite being a small core calculus (less than one column),� models all the low-level details that serverless functionscan observe. To show that � is useful, we present threeapplications. First, to make it easier for programmers to rea-son about their code, we present a simpli�ed semantics ofserverless execution and precisely characterize when thesimpli�ed semantics and � coincide. Second, we augment� with a key-value store, which allows us to reason aboutstateful serverless functions. Third, since a handful of server-less platforms support serverless function composition, weshow how to extend � with a composition language. Wehave implemented this composition language and show thatit outperforms prior work.

1 IntroductionServerless computing, also known as functions as a service, is anew approach to cloud computing that allows programmersto run event-driven functions in the cloud without the needto manage resource allocation or con�gure the runtime en-vironment. Instead, when a programmer deploys a serverlessfunction, the cloud platform automatically manages depen-dencies, compiles code, con�gures the operating system,

and manages resource allocation. Unlike virtual machinesor containers, which require low-level system and resourcemanagement, serverless computing allows programmers tofocus entirely on application code. If demand for the functionsuddenly increases, the cloud platform may transparentlyload another instance of the function on a new machine andmanage load-balancing without programmer intervention.Conversely, the platform will transparently terminate under-utilized instances when demand for the function falls. In fact,the cloud provider may shutdown all running instances ofthe function if there is no demand for an extended period oftime. Since the platform completely manages the operatingsystem and resource allocation in this manner, serverlesscomputing is a language-level abstraction for programmingthe cloud.An economic advantage of serverless functions is that

they only incur costs for the time spent processing events.Therefore, a function that is never invoked incurs no cost.By contrast, virtual machines incur costs when they areidle, and they need idle capacity to smoothly handle unex-pected increases in demand. Serverless computing allowscloud platforms to more easily optimize utilization acrossseveral customers, which increases hardware utilization andlowers costs (e.g., by lowering energy consumption).

Amazon Web Services introduced the �rst serverless com-puting platform, AWS Lambda, in 2014, and similar abstrac-tions are now available from all major cloud providers [1,14, 22, 26, 29, 31]. Serverless computing has seen rapid adop-tion [11], and programmers now often use serverless com-puting to write short, event-driven computations, such asweb services, backends for mobile apps, and aggregators forIoT devices.In the research community, there is burgeoning interest

in developing new programming abstractions for server-less computing, including abstractions for big data process-ing [4, 18, 28], modular programming [7], information �owcontrol [2], chatbot design [8], and virtual network func-tions [40]. However, serverless computing has signi�cant,peculiar characteristics that prior work does not address.

The serverless computing abstraction, despite its many ad-vantages, exposes several low-level operational details thatmake it hard for programmers to write and reason abouttheir code. For example, to reduce latency, serverless plat-forms try to reuse the same function instance to process

1

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

Abhinav Jangda, Donald Pinckney, Samuel Baxter, Joseph Spitzer, Breanna Devore-McDonald, Yuriy Brun, and Arjun Guha

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

multiple requests. However, this behavior is not transparent,and it is easy to write a serverless function that producesincorrect results or leaks con�dential data when reused. Arelated problem is that serverless platforms abruptly termi-nate function instances when they are idle, which can leadto data loss if the programmer is not careful. To tolerate net-work and system failures, serverless platforms automaticallyre-execute functions on di�erent machines. However, it isup to the programmer to ensure that their functions performcorrectly when they are re-executed, which may include con-current re-execution when a transient failure occurs. Theseproblems are exacerbated when an application is composedof several functions. In summary, serverless functions are notfunctions in any typical sense of the word.

Our contributions. This paper presents a formal founda-tion for serverless computing. Based on our experience withseveral major serverless computing platforms (Google CloudFunctions, Apache OpenWhisk, and AWS Lambda), we havedeveloped a detailed, operational semantics of serverlessplatforms, called � , which manifests the essential low-levelbehaviors of these serverless platforms, including failures,concurrency, function restarts, and instance reuse. The de-tails of � matter because they are observable by programs,but programmers �nd it hard to write code that correctlyaddresses all of these behaviors. By elucidating these behav-iors, � can guide the development of programming toolsand extensions to serverless platforms. To demonstrate that� is useful, this paper presents three distinct case studiesthat employ it.First, we address the problem that it can be hard for pro-

grammers to reason about the complex, low-level behavior ofserverless platforms, which � models. We present an alter-native, naive semantics of serverless computing, that elidesthese complex behaviors, and precisely characterize when itis safe for a programmer to use the naive semantics insteadof � . Speci�cally, we establish a weak bisimulation between� and the naive semantics that is conditioned on simplesemantic criteria. This result helps programmers con�dentlyabstract away the low-level details of serverless computing.Second, we address the fact that serverless functions fre-

quently use a cloud-hosted key-value store to share state.We have designed � to be an extensible semantics, so weextend it with a model of a key-value store. Using this exten-sion, we precisely characterize idempotence for serverlessfunctions, which allows programmers to reason using anextended naive semantics, where each event is processed inisolation. The recipe that we follow to add a key-value storeto � could also be used to augment � with models of othercloud computing services.

Third, we account for serverless platforms that go beyondrunning individual serverless functions and also supportserverless function orchestration. We extend � to supporta serverless programming language (��), which has a suite

of I/O primitives, data processing operations, and composi-tion operators that can run safely and e�ciently without theneed for operating system isolation mechanisms. To performoperations that are beyond the scope of ��, we allow ��programs to invoke existing serverless functions as blackboxes. We implement ��, evaluate its performance, and �ndthat it can outperform a popular alternative in certain com-mon cases. Using case studies, we show that �� is expressiveand easy to extend with new features.Our hope is that � will be a foundation for further re-

search on language-based abstractions for serverless comput-ing. The case studies that we present are detailed examplesthat show how to to design new abstractions and study ex-isting abstractions for serverless computing, using � .

The rest of this paper is organized as follows. §2 presentsan overview of serverless computing, and a variety of is-sues that arise when writing serverless code. §3 presents � ,our formal semantics of serverless computing. §4 presentsa simpli�ed semantics of serverless computing and provesexactly when it coincides with � . §5 augments � with akey-value store. §6 and §7 extend � with a language forserverless orchestration. Finally, §8 discusses related workand §9 concludes.

2 Overview of Serverless ComputingTo motivate the need for a formal foundation of serverlesscomputing, consider the serverless function in Figure 1a.1During deployment, the programmer can specify the typesof events that will trigger the function, such as, messageson a message-bus, updates to a database, or web requests toa ��. The function receives the event as a ��-formattedobject. In this example, the type �eld of the event deter-mines whether the function deposits money to an accountor transfers money between accounts. The function also re-ceives a callback, which it must call when event processingis complete. The callback allows the function to run asyn-chronously, although our example is presently synchronous.For brevity, we have elided authentication, authorization,error handling, and most input validation. However, thisfunction su�ers several other problems because it is notproperly designed for serverless execution.

Ephemeral state. The �rst problem arises after a few min-utes of inactivity: all updates to the accounts global variableare lost. This problem occurs because the serverless platformruns the function in an ephemeral container, and silentlyshuts down the container when it is idle. The exact time-out depends on overall load on the cloud provider’s infras-tructure, and is not known to the developer. Moreover, thefunction does not receive a noti�cation before a shut down

1The examples in this paper are in JavaScript— the language that is mostwidely supported by serverless platforms— and arewritten for Google CloudFunctions. However, it is easy to port our examples to other languages andserverless platforms.

2

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

Formal Foundations of Serverless Computing

1 let accounts = new Map();

2 exports.bank = function(req, res) {

3 if (req.body.type === �deposit�) {

4 accounts.set(req.body.name, req.body.value);

5 res.send(true);

6 } else if (req.body.type === �transfer�) {

7 let { from, to, amnt } = req.body;

8 if (accounts.get(from) >= amnt) {

9 accounts.set(to, accounts.get(to) + amnt);

10 accounts.set(from, accounts.get(from) - amnt);

11 res.send(true);

12 } else {

13 res.send(false); }}};

(a) An incorrect implementation that does not address low-leveldetails of serverless execution.1 let Datastore = require(�@google-cloud/datastore�);

2 exports.bank = function(req, res) {

3 let ds = new Datastore({ projectId: �bank-app� });

4 let dst = ds.transaction();

5 dst.run(function() {

6 let tId = ds.key([�Transaction�, req.body.transId]);

7 dst.get(tId, function(err, trans) {

8 if (err || trans) {

9 dst.rollback(function() { res.send(false); });

10 } else if (req.body.type === �deposit�) {

11 let to = ds.key([�Account�, req.body.to]);

12 dst.get(to, function(err, acct) {

13 acct.balance += req.body.amount;

14 dst.save({ key: to, data: acct });

15 dst.save({ key: tId, data: {} });

16 dst.commit(function() { res.send(true); });

17 });

18 } else if (req.body.type === �transfer�) {

19 let amnt = req.body.amount;

20 let from = ds.key([�Account�, req.body.from]);

21 let to = ds.key([�Account�, req.body.to]);

22 dst.get([from, to], function(err, accts) {

23 if (accts[0].balance >= amnt) {

24 accts[0].balance -= amnt;

25 accts[1].balance += amnt;

26 dst.save([{ key: from, data: accts[0] },

27 { key: to, data: accts[1] }]);

28 dst.save({ key: tId, data: {} });

29 dst.commit(function() { res.send(true); });

30 } else {

31 dst.rollback(function() { res.send(false);

32 });}});}}});});};

(b) A correct implementation that addresses instance termination,concurrency, and idempotence.

Figure 1. Serverless functions for banking.

occurs. Similarly, the platform automatically starts a newcontainer when a event eventually arrives, but all state islost. Therefore, the function must serialize all state updatesto a persistent store. In serverless environments, the localdisk is also ephemeral, therefore our example function mustuse a network-attached database or storage system.

Implicit parallel execution. The second problem ariseswhen the function receives multiple events over a shortperiod of time. When a function is under high load, theserverless platform transparently starts new instances of the

function and manages load-balancing to reduce latency. Eachinstance runs in isolation (in a container), there is no guaran-tee that two events from the same requester will be processedby the same instance, and instances may process eventsin parallel. Therefore, a correct implementation should usetransactions to correctly manage this implicit parallelism.

At-least-once execution. The third problem arises whenfailures occur in the serverless computing infrastructure.Serverless platforms are distributed systems that are de-signed to re-invoke functions when a failure is detected.2However, if the failure is transient, a single event may beprocessed to completion multiple times. Most platforms runfunctions at least once in response to a single event, whichcan cause problems when functions have side-e�ects [3, 23,30, 32]. In our example function, this would duplicate de-posits and transfers. We can �x this problem in several ways.A common approach is to require each request to have aunique identi�er, maintain a consistent log of processed iden-ti�ers in the database, and ignore requests that are alreadyin the log.

Figure 1b �xes the three problems mentioned above. Thisimplementation uses a cloud-hosted key-value store for per-sistence, uses transactions to address concurrency, and re-quires each request to have a unique identi�er to supportsafe re-invocation. This version of the function is more thantwice as long as the original, and requires the programmer tohave a deep understanding of the serverless execution model.The next section presents an operational semantics of server-less platforms (� ) that succinctly describes its peculiarities.Using � , the rest of the paper precisely characterizes thekinds of properties that are needed for serverless functions,such as our example, to be correct.

3 Semantics of Serverless ComputingWe now present our operational semantics of serverless plat-forms (� ) that captures the essential details of the serverlessexecution model, including concurrency, failures, executionretries, and function instance reuse. Figure 2 presents �in full, and is divided into two parts: (1) a model of server-less functions, and (2) a model of a serverless platform thatreceives requests (i.e., events) and produces responses by run-ning serverless functions. The peculiar features of serverlesscomputing are captured by the latter part.

Serverless functions. Serverless platforms allow program-mers to write serverless functions in a variety of sourcelanguages, but platforms themselves are source-language ag-nostic. Most platforms only require that serverless functionsoperate asynchronously, process a single request at a time,

2Even if the serverless platform does not re-invoke functions itself, there aresituations where the external caller will re-invoke a function to workarounda transient failure. For example, this arises when functions are triggered by�� requests.

3

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330


331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

Serverless Functions hf , �0, �, recv, stepiFunction name f B · · ·Internal states �B · · ·Initial state �0 2 �Receive event recv 2 � ⇥ �! �Internal step step 2 �! � ⇥ t With e�ect tValues � B · · · ��, ��, etc.Commands t B �

| return(� ) Return value

Serverless PlatformRequest �� x B · · ·Instance �� B · · ·Execution mode mB idle Idle

| busy(x ) Processing xTransition labels `B Internal transition

| start(x, � ) Receive �| stop(x, � ) Respond �

Components C B F(f ,m, � , � ) Function instance| R(f , x, � ) Apply f to �| S(x, � ) Respond with value �

Component set CB {C1, · · · , Cn }

C`7�! C

R��x is fresh

Cstart(x,� )========) CR(f , x, � )

C�� is fresh recv(�, �0) = �

CR(f , x, � ) ) CR(f , x, � )F(f , busy(x ), � , � )

W��recv(�, � ) = � 0

CR(f , x, � )F(f , idle, � , � ) ) CR(f , x, � )F(f , busy(x ), � 0, � )

H��step(� ) = (� 0, � )

CF(f , busy(x ), � , � ) ) CF(f , busy(x ), � 0, � )

R��step(� ) = (� 0, return(� 0))

CR(f , x, � )F(f , busy(x ), � , � )stop(x,� 0)========) CF(f , idle, � , � )S(x, � 0)

D��CF(f ,m, � , � ) ) C

Figure 2. � : An operational model of serverless platforms.

and usually consume and produce �� values. These are thefeatures that our abstract model includes (Figure 2).

In our model, the operation of a serverless function (f ) isde�ned by two functions: a function that receives a request(recv), and a function that takes an internal step (step) thatmay produce a command for the serverless platform (t ). Fornow, the only valid command is return(� ), which indicatesthat the response to the last request is the value � .3 Thestate of a serverless function (� ) is abstract to the serverlessplatform. However, there is a distinguished initial state (�0)in which the platform launches a serverless function. In3 §5 extends our model with new commands.

practice, the initial state depends on the source language.For example, if f were written in JavaScript then �0 wouldbe the initial JavaScript heap.

Serverless platform. � is an operational semantics of aserverless platform that processes several concurrent re-quests. � is written in a process-calculus style, where thestate of the platform consists of a collection of running or idlefunctions (known as function instances), pending requests,and responses. A new request may arrive at any time for aserverless function f , and each request is given a globallyunique identi�er x (the R�� rule). However, the platformdoes not process requests immediately. Instead, at some laterstep, the platform may either cold-start a new instance off (the C�� rule), or it may warm-start by processing therequest on an existing, idle instance of f (the W�� rule).The internal steps of a function instance are unobservable(the H�� rule). The only observable command that aninstance can produce is to respond to a pending request(the R�� rule). When an instance responds to a request, �produces a response object, an observable response event,and marks the function instance as idle, which allows it tobe reused. Finally, a function instance may die at any timewithout noti�cation (the D�� rule).

� captures several subtle behaviors that occur in server-less platforms. (1) The platform produces an observable event(start(x ,� )) when it receives a request (R��), and not whenit starts to process the request. This is necessary because theplatform may start several instances for a single event x , forexample, if the platform detects a potential failure. (2) Duringa warm-start, it does not re-initialize the state. Instead, itapplies recv to whatever last state the instance had. (3) Theplatform can cold-start or warm-start a function instancefor any pending request at any time, even if the request iscurrently being processed. In practice, this behavior occursduring a transient failure, when an instance becomes tem-porarily unreachable. (4) After responding to a request, theplatform does not discard function state (R��), whichallows functions to cache results across invocations. (5) Iftwo instances are processing a single request x and one func-tion responds, the other function will eventually get stuckbecause there is no longer a request for it to respond to. Thestuck function will eventually terminate (D��). (6) Instancetermination is not observable and may occur due to failure,or when the platform reclaims the resources held by an in-stance that is idle or stuck (D��). (7) There is no rule in � torestart function execution after it dies. Instead, the semanticscan nondeterministically launch a new instance at any time.

In summary, � succinctly models the low-level details ofserverless platforms. The semantics manifests the subtletiesthat make serverless programming hard. The next sectionuses � to present a simpler model to make serverless pro-gramming easier.

4

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440


Naive function state AB hf ,m, *� i

A `7�! A

N�S��x is fresh recv(�, �0) = � 0

hf , idle, *� i start(x,� )7��! hf , busy(x ), [�0, � 0]i

N�S��step(� ) = (� 0, � )

hf , busy(x ), *� ++[� ]i 7! hf , busy(x ), *� ++[� , � 0]i

N�S��step(� ) = (� 0, return(� ))

hf , busy(x ), *� ++[� ]i stop(x,� )7��! hf , idle, [� ]i

Figure 3. A naive semantics of serverless functions.

4 A Simpler Serverless SemanticsA natural way to make serverless programming easier is tosimplify the model. For example, we could execute functionsexactly once, or avoid reusing state. Unfortunately, imple-menting these changes is likely to be expensive (and, in manysituations, beyond our control). Therefore, the approach thatwe take in this section is to de�ne a simpler, naive semanticsof serverless computing. Moreover, using a weak bisimu-lation theorem, we precisely characterize when the naivesemantics and � coincide. This theorem addresses the low-level details of serverless execution once and for all, thusfreeing programmers to reason using the naive semanticsinstead of � .In the naive serverless semantics (Figure 3), serverless

functions (f ) are the same as the serverless functions in � .However, the operational semantics of the naive platform ismuch simpler: the platform runs a single function f on onerequest at a time, without any concurrency or failures. Ateach step of execution, the naive semantics either (1) startsprocessing a new event if the platform is idle, (2) takes aninternal step if the platform is busy, or (3) responds andreturns to being idle. The state of a naive platform consistsof (1) the function’s name (f ); (2) its execution mode (m); and(3) a trace of function states (*� ), where the last element of thetrace is the current state, and the �rst element was the initialstate of the function instance when it received the currentrequest. The trace is a convenience that helps us relate thenaive semantics to � , but has no e�ect on execution.

Serverless function correctness. Note that the naive se-mantics is an idealized model and is not correct for arbitraryserverless functions. However, we can precisely characterizethe exact conditions when it is safe for a programmer to rea-son with the naive semantics, even if their code is running ona full-�edged serverless platform (i.e., using � ). We requirethe programmer to de�ne a safety relation (R) over the stateof the serverless function. At a high-level, the programmermust ensure that (1) R is an equivalence relation, (2) pairsof equivalent states step to pairs of equivalent states, and(3) that the serverless function produces the same observable

command (if any) on equivalent states. The safety relation isformally de�ned as follows.

De�nition 4.1 (Safety Relation). For a serverless functionhf ,�0, �, recv, stepi, the relationR ✓ �⇥� is a safety relationif:

1. R is an equivalence relation,2. for all (�1,�2) 2 R and � , (recv(�,�1), recv(�,�2)) 2R,

3. for all (�1,�2) 2 R , if (� 01, t1) = step(�1) and (� 02, t2) =step(�2) then (� 01,�

02 ) 2 R and t1 = t2.

Weak bisimulation. We now prove a weak bisimulationbetween the naive semantics and � , conditioned on theserverless functions satisfying the safely relation de�nedabove. We prove a weak (rather than a strong) bisimulationbecause � models serverless execution inmore detail. There-fore, a single step in the naive semantics may correspond toseveral steps in � . The theorem below states that the naivesemantics and � are indistinguishable to the programmer,modulo unobservable steps.To establish the weak bisimulation, we de�ne a relation

(A ⇡ C) that precisely describes when naive state (A) isequivalent to a � state (C), as follows.

De�nition 4.2 (Bisimulation Relation). A ⇡ C is de�nedas:

1. hf , idle,*� i ⇡ C, and2. If for all F( f , busy(x ),� ,�) 2 C, 9� 0.� 0 2 *� such that

(� ,� 0) 2 R then hf , busy(x ),*� i ⇡ R( f ,x ,� ) C.

This relation formally captures several key ideas that arenecessary to reason about serverless execution. (1) A singlenaive state may be equivalent to multiple distinct � states.This may occur due to failures and restarts. (2) Conversely,a single � state may be equivalent to several naive states.This occurs when a serverless platform is processing severalrequests. In fact, we require all � states to be equivalentto all idle naive states, which is necessary for � to receiverequests at any time. (3) The � state may have several func-tion instances evaluating the same request. (4) Due to warmstarts, the state of a function may not be identical in thetwo semantics; however, they will be equivalent (per R).(5) Due to failures, the � semantics can “fall behind” thenaive semantics during evaluation, but the state of any func-tion instance in � will be equivalent to some state in theexecution history of the naive semantics. The weak bisimu-lation theorem accounts for failures, by specifying the seriesof � steps needed to then catch up with the naive semanticsbefore an observable event occurs.

An important simpli�cation in the naive semantics is thatit executes a single request at a time. Therefore, to relatea naive execution to a � execution, we need to �lter outevents that are generated by other requests. To do so, we

5

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495


496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

de�ne x (*`) as the sub-sequence of ` that only consists of

events labeled x .We can now state the weak bisimulation theorem.

Theorem 4.3 (Weak Bisimulation). For a serverless functionf with a safety relation R, for all A, C, `:

1. For all A 0, if A `7�! A 0 and A ⇡ C then there exists*`1,

*`2, C0, Ci and Ci+1 such that C )

*`1=) Ci

`=) Ci+1 )

*`2=) C0,

x (*`1) = � , x (

*`2) = � , and A 0 ⇡ C0

2. For all C0, if C`=) C0 and A ⇡ C then there exists A 0

such that A 0 ⇡ C0 and A 7�!`�! A 0.Proof. By Theorems A.4 and A.5 in the appendix (part of thesubmitted supplemental material). ⇤

The �rst part of this theorem states that every step inthe naive semantics corresponds to some sequence of stepsin � . We can interpret this as the sequence of steps that aserverless platform needs to execute to faithfully implementthe naive semantics. On the other hand, the second part ofthis theorem states that any arbitrary step in � — includingfailures, retries, and warm starts — corresponds to a (possiblyempty) sequence of steps in the naive semantics.In summary, this theorem allows programmers to justi�-

ably ignore the low-level details of � , and simply use thenaive semantics, if their code satis�es De�nition 4.1. Thereare now several tools that are working toward verifyingthese kinds of properties in scripting languages, such asJavaScript [19, 34], which is the most widely supported lan-guage for writing serverless functions. Our work, whichis source language-neutral, complements this work by es-tablishing the veri�cation conditions necessary for correctserverless execution.

5 Serverless Functions and Cloud StorageIt is common for serverless functions to use an externaldatabase for persistent storage because their local state isephemeral. But, serverless platforms warn programmers thatstateful serverless functions must be idempotent [23, 30, 32].In other words, they should be able to tolerate re-execution.Unfortunately, it is completely up to programmers to ensurethat their code is idempotent, and platforms do not pro-vide a clear explanation of what idempotence means, giventhat serverless functions perform warm-starts, execute con-currently, and may fail at any time. We now address theseproblems by adding a key-value store to both � and thenaive semantics, and present an extended weak bisimula-tion. In particular, the naive semantics still processes a singlerequest at a time, which is a convenient mental model forprogrammers.

Figure 4 augments � with a key-value store that supportstransactions. To the set of components, we add exactly onekey-value store (D(M,L)), which has a map from keys to

Serverless FunctionsKey set k B stringsKey-value map M 2 k * �Commands t B · · ·

| beginTx Lock data store| read(k ) Read value| write(k, � ) Write value| endTx Unlock data store

Component set CB {C1, · · · , Cn, D(M, L) }Lock state LB free Data store is free

| owned(�, M ) Owned by �

R��step(� ) = (� 0, read(k )) recv(M 0(k ), � 0) = � 00

CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 00, busy(x ), � )D(M, owned(�, M 0))

W��step(� ) = (� 0, write(k, � )) M 00 = M 0[k 7! �]

CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 0, busy(x ), � )D(M, owned(�, M 00))

B��T�step(� ) = (� 0, beginTx)

CF(f , � , busy(x ), � )D(M, free)) CF(f , � 0, busy(x ), � )D(M, owned(�, M ))

E��T�step(� ) = (� 0, endTx)

CF(f , � , busy(x ), � )D(M, owned(�, M 0))) CF(f , � 0, busy(x ), � )D(M 0, free)

D��T�8f �x . F(f , � , busy(x ), � ) < C

CD(M, owned(�, M 0)) ) CD(M, free)

Figure 4. � augmented with a key-value store.

values (M) and a lock (L), which is either unlocked (free) orcontains uncommitted updates from the function instancethat holds the lock (owned(�,M 0)). Note that the lock isheld by a function instance and not a request, since theremay be several running instances processing the same re-quest. We allow serverless functions to produce four newcommands: beginTx starts a transaction, endTx commits atransaction, read(k ) reads the value associated with key k ,and write(k,� ) sets the key k to value � . We add four newrules to � that execute these commands in the natural way:B��T� blocks until it can acquire a lock, E��T� commitschanges and releases a lock, and for simplicity, the R��andW�� rules require the running instance to have a lock.Finally, we need a �fth rule (D��T�) that releases a lock anddiscards its uncommitted changes if the function instancethat held the lock no longer exists. This may occur if thefunction instance dies before committing its changes.

Idempotence in the naive semantics. There are severalways to ensure that a serverless function is idempotent. Acommon protocol is to commit each output value, keyedby the unique request ��, to the key-value store, within atransactional update. Therefore, if the request is re-tried,

6

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605


Serverless FunctionsOptional key-value map HM B lo�ed(M, *� ) | commit(� ) | ·HM, A `7�! HM, A

A 7! A0HM, A 7! HM, A0

A start(x,� )7��! A0

·, A start(x,� )7��! ·, A0A stop(x,� )7��! A0

commit(� ), A stop(x,� )7��! ·, A0

N�R��step(� ) = (� 0, read(k )) recv(M (k ), � 0) = � 00

lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! lo�ed(M, *� 0), hf , busy(x ), *� ++[� , � 0, � 00]i

N�W��step(� ) = (� 0, write(k, � )) M 0 = M[k 7! �]

lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! lo�ed(M 0, *� 0), hf , busy(x ), *� ++[� , � 0]i

N�B��T�step(� ) = (� 0, beginTx)

·, hf , busy(x ), *� ++[� ]i7! lo�ed(M, *� ++[� ]), hf , busy(x ), *� ++[� , � 0]i

N�E��T�step(� ) = (� 0, endTx)

lo�ed(M, *� 0), hf , busy(x ), *� ++[� ]i7! commit(M (x )), hf , busy(x ), *� ++[� , � 0]i

N�R��lo�ed(M, *� 0), hf , busy(x ), *� i 7! ·, hf , busy(x ), *� 0i

Figure 5. The naive semantics with a key-value store.

the function can lookup and return the saved output value.We now formally characterize this protocol, and use it toprove a weak bisimulation theorem between � and thenaive semantics, where each is extended with a key-valuestore. This will allow programmers to reason about serverlessexecution using the naive semantics, which processes exactlyone request at a time, without concurrency.The challenge we face is to extend the bisimulation rela-

tion (De�nition 4.2) to account for the key-value store. Inthat de�nition, when the � state (C) and the naive state(A) are equivalent (A ⇡ C), it is possible for a failure tooccur (C ) C0), such that after the failure, � falls behindthe naive semantics. Nevertheless, we still treat the statesas equivalent (A ⇡ C0), and let the weak bisimulation proofre-invoke a function in C0 until it catches up with A. Un-fortunately, for an arbitrary serverless function, replay doesnot work because the key-value store may have changed.To address this, we need to ensure that functions that usethe key-value store follow an appropriate protocol to ensureidempotence. A more subtle problem arises when the naivestate (A) is within a transaction, and the equivalent � statetakes several steps (C)) C0) that result in a failure, followedby other updates to the key-value store. If this occurs, thenaive semantics must rollback to the start of the transactionand re-execute with the key-value store in C0.Figure 5 shows the extended naive semantics, which ad-

dresses these issues. In this semantics, the naive key-value

store (HM) goes through three states: (1) at the start of exe-cution, it is not present; (2) when a transaction begins, thesemantics selects a new mapping nondeterministically; and(3) when the transaction completes, the mapping moves to acommitted state, where it only contains the �nal result. Forsimplicity, we assume that reads andwrites only occurwithintransactions. The semantics also includes aN�R�� rule,which allows execution to rollback to the start of transac-tion. However, once a transaction is complete (N�E��T�), arollback is not possible.The extended bisimulation relation, shown below, uses

the bisimulation relation from the previous section (De�ni-tion 4.2). When the naive semantics is within a transaction,the relation requires some instance in � to be operating inlock-step with the naive semantics. However, it allows otherinstances that are not using the key-value store to makeprogress. Therefore, a transaction is not globally atomic in� , and other requests can be received and processed whilesome instance is in a transaction.

De�nition 5.1 (Extended Bisimulation Relation). HM,A ⇡D(M,L),C is de�ned as:

1. If A ⇡ C then ·,A ⇡ D(M, free)C, or2. If A = hf , busy(x ),*�++[� ]i, L = owned(�,M 0), HM =

lo�ed(M 0,*� ), and there exists F( f ,H� , busy(x ),�) 2C such that (� ,H� ) 2 R then HM,A ⇡ D(M,L),C, or

3. If A = hf , busy(x ),*� i, HM = commit(� ), and M (x ) =� then HM,A ⇡ D(M, free)C.

With this de�nition in place, we can prove an extendedweak bisimulation relation.

Theorem 5.2 (Extended Weak Bisimulation). For a server-less function f , if R is its safely relation and for all requestsR( f ,x ,� ) the following conditions hold:

1. f produces the value stored at key x , if it exists,2. When f completes a transaction, it stores a value � 0 at

key x , and3. When f stops, it produces � 0,

then, for all HM , A, C, ` :

1. For all HM0,A 0, if HM,A `7�! HM

0,A 0 and A ⇡ C then

there exists*`1,*`2, C0, Ci and Ci+1 such that C)

*`1=) Ci

`=)

Ci+1 )*`1=) C0, x (*`1) = � , x (

*`2) = � , and A 0 ⇡ C0

2. For all C0, if C`=) C0 and HM,A ⇡ C then there exists

A 0 such that A 0 ⇡ C0 and A 7�!`�! A 0.Proof. By Theorems A.9 and A.10 in the appendix (part ofthe submitted supplemental material). ⇤

The theorem statement in the appendix formalizes theconditions of the bisimulation, but the less formal conditionsare useful for programmers, since they are simple require-ments that are easy to ensure. Therefore, this theorem gives

7

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660


661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

the assurance that the reasoning with the naive semanticsis adequate, even though the serverless platform operatesusing � .

6 Serverless CompositionsThus far, we have discussed the basic serverless computingabstraction, where serverless functions are black-boxes tothe platform. However, there are many situations where thisabstraction makes it hard to write serverless programs in amodular way [7]. This section extends � with a signi�cantnew feature: a serverless programming language (��) forcomposing serverless functions that is designed to directlyrun on a serverless platform (i.e., without operating systemvirtualization). We motivate the need for �� (§6.1), presentthe design of �� (§6.2), and discuss our implementation(§6.3). Then, §7 evaluates ��’s performance.

6.1 The Need for Serverless Composition LanguagesBaldini et al. [7] present several problems that arise whenprogrammers build new serverless functions by composingexisting serverless functions. The natural approach to server-less function composition is to have one serverless functionact as a coordinator that invokes other functions. For ex-ample, Figure 6 shows a serverless function that depositsfunds into two accounts by calling our example function(bank) from §2. A problem with this approach is that theprogrammer gets “double-billed” for the time spent runningthe coordinator function (deposit2) which is mostly idleand the time spent doing actual work in bank. An alternativeapproach is to merge several functions into a single function.Unfortunately, this approach hinders code-reuse. In partic-ular, it does not work when source code is unavailable orwhen the serverless functions are written in di�erent lan-guages. A third approach is to write serverless functions thateach pass their output as input to another function, insteadof returning to the caller (i.e., continuation-passing style).However, this approach requires rewriting code. Moreover,some clients, such as web browsers, cannot produce a contin-uation �� to receive the �nal result. Baldini et al. show thatthe only way to address these problems is to add primitivecomposition operators to serverless platforms [7].

6.2 Composing Serverless Functions with ArrowsWe now extend � with a domain speci�c language for com-posing serverless functions, which we call �� (serverlessprogramming language). Since the serverless platform is ashared resource and programs are untrusted, �� cannot runarbitrary code. However, �� programs can invoke serverlessfunctions to perform arbitrary computation when needed.Therefore, invoking a serverless function is a primitive op-eration in ��, which serves as the wiring between severalserverless functions.

1 let request = require(�request-promise-native�);

2 exports.deposit2 = function(req, res) {

3 let { amount, tId1, tId2 } = req.body;

4 if (amount > 100) {

5 request.post({ url: ..., json: { type: �deposit�,

6 to: �checking�, amount: amount - 100,

7 transId: tId1 }})

8 .then(function() {


10 to: �savings�, amount: 100, transId: tId2 }})

11 .then(function() { res.send(true); });});

12 } else {


14 to: �checking�, amount: amount, transId: tId1 }})

15 .then(function() { res.send(true); });}};

Figure 6.A serverless function to deposit funds, based on theamount provided, using the serverless function in Figure 1b.

Values � B · · ·| (�1, �2) Tuples

�� expressions e B invoke f Invoke serverless function| first e Run e to �rst part of input| e1 >>> e2 Sequencing

�� continuations � B ret x Response to request| seq e � In a sequence| first � � In first

Components C B · · ·| E(e, �, � ) Running program| E(x, � ) Waiting program| R(e, x, � ) Run program e on �

C`7�! C

P�N��R��x is fresh

Cstart(� )======) CR(e, x, � )

P�S�� CR(e, x, � ) ) CR(e, x, � )E(e, �, ret x )

P�R�� CE(� 0, ret x )R(e, x, � )stop(� 0)======) CS(x, � 0)

P�S��1 CE(e1 >>> e2, �, � ) ) CE(e1, �, seq e2 � )

P�S��2 CE(�, seq e � ) ) CE(e, �, � )

P�I��1x 0 is fresh

CE(invoke f , �, � ) ) CE(x 0, � )R(f , x 0, � )

P�I��2 CE(x, � )S(x, � ) ) CE(�, � )

P�F��1 CE(first e, (�1, �2), � ) ) CE(e, �1, first �2 � )

P�F��2 CE(�1, first �2 � ) ) CE((�1, �2), � )

P�D�� CE(�, � ) ) C

Figure 7. Extending � with ��.

Figure 7 extends the � with ��. This extension allowsrequests to run �� programs (R(e,x ,� )), in addition to ordi-nary requests that name serverless functions.4 �� is based

4In practice, a request would name an �� program instead of carrying theprogram itself.

8

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770


�� expressions e B · · · | p Run transformation�� values � ::=n | b | str | null�� pattern p B � �� literal

| [p1, · · · , pn ] Array| {str1 : p1, · · · , strn : pn } Object| p1 op p2 Operators| if (p1) then p2 else p3 Conditional| [str1 ! p1] Update �eld| in q Input reference

�� query q B Empty query| .[n]q Array index| .idq Field lookup

Figure 8. �� transformation language.

on Hughes’ arrows [27], thus it supports the three basic ar-row combinators. An �� program can (1) invoke a serverlessfunction (invoke f ); (2) run two subprograms in sequence(e1 >>> e2); or (3) run a subprogram on the �rst componentof a tuple, and return the second component unchanged(first e). These three operations are su�cient to describeloop- and branch-free compositions of serverless functions.It is straightforward to add support for bounded loops andbranches, which we do in our implementation.

To run �� programs, we introduce a new kind of compo-nent (E) that executes programs using an abstract machinethat is similar to a �� machine [15]. In other words, theevaluation rules de�ne a small-step semantics with an ex-plicit representation of the continuation (�). This design isnecessary because programs need to suspend execution toinvoke serverless functions (P�I��1) and then later re-sume execution (P�I��2). Similar to serverless functions,�� programs also execute at-least-once. Therefore, a sin-gle request may spawn several programs (P�S��) and aprogram may die while waiting for a serverless function toresponse (P�D��).

A sub-language for �� transformations. A problemthat arises in practice is that input and output values toserverless functions (�) are frequently formatted as ��values, which makes hard to de�ne the first operator in a sat-isfactory way. For example, we could de�ne first to operateover two-element �� arrays, and then require program-mers to write serverless functions to transform arbitrary�� into this format. However, this approach is cumbersomeand resource-intensive. For even simple transformations, theprogrammer would have to write and deploy serverless func-tions; the serverless platform would need to sandbox theprocess using heavyweight �� mechanisms; and the plat-form would have to copy values to and from the process.Instead, we augment �� with a sub-language of ��

transformations (Figure 8). This language is a superset of��. It has a distinguished variable (in) that refers to theinput �� value, which may be followed by a query to selectfragments of the input. For example, we can use this trans-formation language to write an �� program that receives

a <- invoke f(in); b <- invoke g(in);c <- invoke h({ x: b, y: a.d }); ret c;

(a) Surface syntax program.[in, { input: in }] >>> first (invoke f) >>>

[in[1].input, in[1][a -> in[0]]] >>>

first (invoke g) >>>

[{ x: in[0], y: in[1].a.d }, in[1]] >>>

first (invoke h) >>> in[0]

(b) Naive translation to ��.[in, { input: in }] >>> first (invoke f) >>>

[in[1].input, { a: in[0] }] >>> first (invoke g) >>>

[{ x: in[0], y: in[1].a.d }, {}] >>>


(c) Live variable analysis eliminates several �elds.[in, { input: in }] >>> first (invoke f) >>>

[in[1].input, { ad: in[0].d }] >>> first (invoke g) >>>

[{ x: in[0], y: in[1].ad }, {}] >>>


(d) Live key analysis immediately projects a.d.

Figure 9. Compiling the surface syntax of ��.

a two-element array as input and then runs two di�erentserverless functions on each element:first (invoke f) >>> [in[1], in[0]] >>> first (invoke g)

Without the �� transformation language, we would needan auxiliary serverless function to swap the elements.

A simpler notation for �� programs. �� is designed tobe a minimal set of primitives that are straightforward toimplement in a serverless platform. However, �� programsare di�cult for programmers to comprehend. To address thisproblem, we have also developed a surface syntax for ��that is based on Paterson’s notation for arrows [35]. Usingthe surface syntax, we can rewrite the previous example asfollows:x <- invoke f(in[0]); y <- invoke g(in[1]); ret [y, x];

This version is far less cryptic than the original.We describe the surface syntax compiler by example. At

a high level, the compiler produces �� programs in store-passing style. For example, Figure 9a shows a surface syntaxprogram that invokes three serverless functions (f , �, and h).However, the composition is non-trivial because the inputof each function is not simply the output of the previousfunction. We compile this program to an equivalent �� pro-gram that uses the �� transformation language to saveintermediate values in a dictionary (Figure 9b). However,this naive translation carries unnecessary intermediate state.We address this problem with two optimizations. First, thecompiler performs a live variable analysis, which producesthe more compact program shown in Figure 9c. In the origi-nal program, the input reference (in) is not live after �, and

9

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825


826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

c is the only live variable after h, thus these are eliminatedfrom the state. Second, the compiler performs a livenessanalysis of the �� keys returned by serverless functions,which produces an even smaller program (Figure 9d). In ourexample, f returns an object a, but the program only usesa.d and discards any other �elds that a may have. There aremany situations where the entire object a may be signi�-cantly larger than a.d , thus extracting it early can shrink theamount of state a program carries.

6.3 ImplementationOpenWhisk implementation. Apache OpenWhisk is ama-ture and widely-deployed serverless platform that is writ-ten in Scala and is the foundation of �� Cloud Functions.We have implemented �� as a 1200 �� patch to Open-Whisk, which includes the surface syntax compiler and sev-eral changes to the OpenWhisk runtime system. We inheritOpenWhisk’s fault tolerance mechanisms (e.g., at-least-onceexecution) and reuse OpenWhisk’s support for serverlessfunction sequences [7] to implement the >>> operator of ��.Our OpenWhisk implementation of �� has three di�er-

ences from the language presented so far. First, it supportsbounded loops, which are a programming convenience. Sec-ond, instead of implementing the first operator and the ��transformation language as independent expressions, wehave a single operator that performs the same action as first,but applies a �� transformation to the input and output,which is how transformations are most commonly used. Fi-nally, we implement a multi-armed conditional, which is astraightforward extension to ��. These operators allow usto compile the surface syntax to smaller �� programs, whichmoderately improves performance.

Portable implementation. We have also built a portableimplementation of �� (1156 �� of Rust) that can invokeserverless functions in public clouds. (We have tested withGoogle Cloud Functions.) Whereas the OpenWhisk imple-mentation allows us to carefully measure load and utilizationon our own hardware test-bed, we cannot perform the sameexperiments with our standalone implementation, since pub-lic clouds abstract away the hardware used to run serverlessfunctions. The portable implementation has helped us ensurethat the design of �� is independent of the design and im-plementation of OpenWhisk, and we have used it to exploreother kinds of features that a serverless platformmay wish toprovide. For example, we have added a fetch operator to ��that receives the name of a �le in cloud storage as input andproduces the �le’s contents as output. It is common to haveserverless functions fetch private �les from cloud storage(after an access control check). The fetch operator can makethese kinds of functions faster and consume fewer resources.

7 EvaluationThis section �rst evaluates the performance of �� using mi-crobenchmarks, and then highlights the expressivity of ��with three case studies. We will release our implementationsas open-source and submit our experiments to the ��.

7.1 Comparison to OpenWhisk ConductorThe Apache OpenWhisk serverless platform has built-insupport for composing serverless functions using a Conduc-tor [37]. A Conductor is special type of serverless functionthat, when invoked, may respond with the name and argu-ments of an auxiliary serverless function for the platform toinvoke, instead of returning immediately. When the auxil-iary function returns a response, the platform re-invokes theConductor with the response value. Therefore, the platforminterleaves the execution of the Conductor function and itsauxiliary functions, which allows a Conductor to implementsequential control �ow, similar to ��. The key di�erencebetween �� and a Conductor, is that �� is designed to run di-rectly on the platform, whereas the Conductor is a serverlessfunction itself, which consumes additional resources.

This section compares the performance of �� to Conduc-tor with a microbenchmark that stresses the performance ofthe serverless platform. The benchmark �� program runs asequence of ten serverless functions, and we translate it to anequivalent program for Conductor. We run two experimentsthat (1) vary the number of concurrent requests and (2) varythe size of the requests. In both experiments, we measureend-to-end response time, which is the metric that is mostrelevant to programmers. We �nd that �� outperforms Con-ductor in both experiments, which is expected because itsdesign requires fewer resources, as explained below.

We conduct our experiments on a six-core Intel Xeon E5-1650 with 64 GB RAM with Hyper-Threading enabled.

Concurrent invocations. Our �rst experiment shows howresponse times change when the system is processing sev-eral concurrent requests. We run N concurrent requests ofthe same program, and then measure �ve response times.Figure 10a shows that �� is slightly faster than Conductorwhen N 12, but approximately twice as fast when N > 12.We attribute this to the fact that Conductor interleaves theconductor function with the ten serverless functions, thus itrequires twice as many containers to run. Moreover, sinceour �� has six hyper-threaded cores, Conductor is over-loaded with 12 concurrent requests.

Request size. Our second experiment shows how responsetimes depend on the size of the input request. We use thesame microbenchmark as before, and ensure that the plat-form only processes one request at a time. We vary the re-quest body size from 0KB to 1MB and measure �ve responsetimes at each request size. Figure 10b shows that �� is al-most twice as fast as Conductor. We again attribute this to

10

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935


●●●●● ●●●●● ●●

●●● ●●

●●●

●●●●●

●●

●

●●

●●●●●

●●●●●

●

●●

●

●

●

●●●●

●●●

●●

●●●

●

●

●

●

●

●

●●

●

●

●●

●

●

●●●

0

30

60

90

0 4 8 12 16 20 24 28

Concurrent Requests

Res

pons

e T

ime

(s)

● Conductor

SPL

(a) Response time vs. concurrent requests.

●●●●●

●

●●●

●

●●

●●

●

●

●●●●

●●●●●

●●●●●

●●●

●

●●●

●●● ●

●●●●

●

●

●●

●

●

●

●●

●

●●●●●

●●

●●●

●●●

●

● ●●●●●

●●●●●

●●●●● ●

●●●●

●

●●●●

●●●●●

0.5

1.0

1.5

025

050

075

0

Request Size (KB)

Res

pons

e T

ime

(s)

● Conductor

SPL

(b) Response time vs. request size.

●● ●

●●● ●●●●●●●●●●

●●●

●● ● ●●

●● ● ● ● ●● ● ● ●● ● ●●0

10

20

30

40

0 1 2 3 5 9 15 25 41 67 108

174

281

456

73711

8619

0430

7949

92

Size of CSV (KB)

Run

tim

e (s

)

Overhead

Total

(c) Response time and overhead.

Figure 10. �� benchmarks. Figures 10a and 10b show that �� is faster than OpenWhisk Conductor when the serverlessplatform is processing several concurrent requests, or when the request size increases. Figure 10c shows that most of theexecution time is spent in serverless functions, and not running ��.

the fact that Conductor needs to copy the request across asequence of functions that is twice as long as ��.

Summary. The ��-based isolation techniques that Conduc-tor uses have a nontrivial cost. ��, since it uses language-based isolation, is able to lower the resource utilization ofserverless compositions by up to a factor of two.

7.2 Case StudiesThe core �� language, presented in §6, is a minimal fragmentthat lacks convenient features that are needed for real-worldprogramming. To identity the additional features that arenecessary, we have written several di�erent kinds of ��programs, and added new features to �� when necessary.Fortunately, these new features easily �t the structure estab-lished by the core language. This section presents some ofthe programs that we’ve built and discusses the new featuresthat they employ. These examples illustrate that it is easy togrow core �� into a convenient and expressive programminglanguage for serverless function composition.

Conditional bank deposits. Figure 11a uses �� to rewritethe bank deposit function (Figure 6). The original programsu�ered the “double billing” problem, since the compositeserverless function needs to be active while waiting for thebank serverless function to do actual work. In contrast, theserverless platform suspends the �� program when it in-vokes a serverless function and resumes it when the responseis ready. This example also shows that our implementationsupports basic arithmetic, which we add to the �� trans-formation sub-language in a straightforward way.

Continuous integration. Baldini et al. [7] present a com-pelling example of serverless composition, which we adaptfor ��. Our �� program (Figure 11b) connects GitHub, Slack,and Google Cloud Build (which is a continuous integrationtool, similar to TravisCI). Cloud Build makes it easy to runtests when a new commit is pushed to GitHub. But, it is muchharder to see the test results in a convenient way. However,

1 if (in.amount > 100) {

2 invoke bank({ type: �deposit�, to: �checking�,

3 amount: in.amount - 100, transId: in.tId1 });

4 invoke bank({ type: �deposit�, to: �savings�,

5 amount: 100, transId: in.tId2 });

6 } else {

7 invoke bank({ type: �deposit�, to: �checking�,

8 amount: in.amount, transId: in.tId1 });

9 } ret;

(a) Receive funds, deposit $100 in savings (if feasible), and depositthe rest in checking.1 invoke postStatusToGitHub({ state: in.state,2 sha: in.sha, url: in.url, repo: �<repo owner/name>� });

3 if (in.state == �failure�) {

4 invoke postToSlack({ channel: �<id>�, text: �<msg>� });

5 } ret;

(b) Receive build state from Google Cloud Build, set status onGitHub, and report failures on Slack.1 data <- get in.url;2 json <- invoke csvToJson(data);

3 out <- invoke plotJson({data:json,x:in.xAxis,y:in.yAxis});4 ret out;

(c) Receive a �� of a �� le and two column names, downloadthe �le, convert it to ��, then plot the ��.

Figure 11. Example �� programs.

Cloud Build can invoke a serverless function when testscomplete, and we use it run an �� program that (1) uses theGitHub �� to add a test-status icon next to each commitmessage, and (2) uses the Slack �� to post a message when-ever a test fails. However, instead of writing a monolithicserverless function, we �rst write two serverless functionsthat post to Slack and set GitHub status icons respectively,and let the �� program invoke them. It is easy to reuse theGitHub and Slack functions in other applications.

Data visualization. Our last example (Figure 11c) receivesthe �� of a ��-formatted data �le with two columns, plots

11

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990


991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

the data, and responds with the plotted image. This is thekind of task that a power-constrained mobile applicationmay wish to o�oad to a serverless platform, especially whenthe size of the data is signi�cantly larger than the plot. Ourprogram invokes two independent serverless functions thatare trivial wrappers around popular JavaScript libraries: csvj-son, which converts �� to �� and vega, which createsplots from �� data. This example uses a new primitive(get) that our implementation supports to download the data�le. Downloading is a common operation that is natural forthe platform to provide. Our implementation simply issuesan �� request and suspends the �� program until theresponse is available. However, it is easy to imagine moresophisticated implementations that support caching, autho-rization, and other features. Finally, this example show thatour �� implementation is not limited to processing ��.The get command produces a plain-text �� le, and theplotJson invocation produces a �� image.A natural question to ask about this example is whether

the decomposition into three parts introduces excessive com-munication overhead. We investigate this by varying thesize of the input ��s (ten trials per size), and measuringthe total running time and the overhead, which we de�neas the time spent outside serverless functions (i.e., transfer-ring data, running get, and applying �� transformations).Figure 10c shows that even as the �le size approaches 5 MB,the overhead remains low (less than 3% for a 5 MB �le, andup to 25% for 1 KB �le).

8 Related WorkServerless computing. Baldini et al. [7] introduce the prob-lem of serverless function composition and present a newprimitive for serverless function sequencing. Subsequentwork develops serverless state machines (Conductor) and a�� (Composer) thatmakes statemachines easier towrite [37].In §6, we present an alternative set of composition opera-tors that we formalize as an extension to � , implement inOpenWhisk, and evaluate their performance.

Trapeze [2] presents dynamic �� for serverless computing,and further sandboxes serverless functions to mediate theirinteractions with shared storage. Their Coq formalizationof termination-sensitive noninterference does not modelseveral features of serverless platforms, such as warm startsand failures, that our semantics does model.Several projects exploit serverless computing for elastic

parallelization [4, 17, 18, 28]. §6 addresses modularity anddoes not support parallel execution. However, it would bean interesting challenge to grow the �� in §6 to the pointwhere it can support the aforementioned applications with-out any non-serverless computing components. It is worthnoting that today’s serverless execution model is not a good�t for all applications. For example, Singhvi et al. [40] list

several barriers to running network functions on serverlessplatforms.

Serverless computing and other container-based platformssu�er several performance and utilization barriers. Thereare several ways to address these problems, including data-center design [20], resource allocation [9], programming ab-stractions [7, 37], and edge computing [5]. � is designed toelucidate subtle semantic issues (not performance problems)that a�ect programmers building serverless applications.

Language-based approaches to microservices. �FAIL [38]is a semantics for horizontally-scaled services with durablestorage, which are related to serverless computing. A keydi�erence between � and �FAIL is that � models warm-starts, which occur when a serverless platform runs a newrequest on an old function instance, without resetting itsstate. Warm-starts make it hard to reason about correctness,but this paper presents an approach to do so. Both � and�FAIL present weak bisimulations between detailed and naivesemantics. However, � ’s naive semantics processes a sin-gle request at a time, whereas �FAIL’s idealized semanticshas concurrency. We use � to specify a protocol to ensurethat serverless functions are idempotent and fault tolerant.However, �FAIL also presents a compiler that automaticallyensures that these properties hold for C# and F# code. Webelieve the approach would work for � . §6 extends � withnew primitives, which we then implement and evaluate.

Whip [41] and ucheck [33] are tools that check propertiesof microservice-based applications at run-time. These worksare complementary to ours. For example, our paper identi�esseveral important properties of serverless functions, whichcould then be checked using Whip or ucheck.

Cloud orchestration frameworks. Ballerina [42] is a lan-guage for managing cloud environments; Engage [16] is adeployment manager that supports inter-machine depen-dencies; and Pulumi [36] is an embedded �� for writingprograms that con�gure and run in the cloud. In contrast,� is a semantics of serverless computing. §6 uses � to de-sign and implement a language for composing serverlessfunctions that runs within a serverless platform.

Veri�cation. There is a large body of work on veri�cation,testing, and modular programming for distributed systemsand algorithms (e.g., [6, 10, 12, 13, 21, 24, 25, 39, 43]). Theserverless computation model is more constrained than arbi-trary distributed systems and algorithms. This paper presentsa formal semantics of serverless computing, � , with an em-phasis on low-level details that are observable by programs,and thus hard for programmers to get right. To demonstratethat � is useful, we present three applications that employit and extend it in several ways. This paper does not addressveri�cation for serverless computing, but � could be usedas a foundation for future veri�cation work.

12

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100


9 ConclusionWe have presented � , an operational semantics that modelsthe low-level of serverless platforms that are observable byprogrammers. We have also presented three applications of� . (1) We prove a weak bisimulation to characterize whenprogrammers can ignore the low-level details of � . (2) Weextend � with a key-value store to reason about statefulfunctions. (3) We extend � with a language for serverlessfunction composition, implement it, and evaluate its per-formance. We hope that these applications show that �can be a foundation for further research on language-basedapproaches to serverless computing.

References[1] A��, I. E., C��, R., R��, I., S��, M., S��, K., B��, A.,

A��, P., �� H��, V. SAND: Towards high-performance serverlesscomputing. In USENIX Annual Technical Conference (ATC) (2018).

[2] A��, K., F��, C., F��, S., R��, L., S��, M.,S��, T., ��W��, K. Secure serverless computing using dy-namic information �ow control. Proc. ACM Program. Lang. 2, OOPSLA(Oct. 2018).

[3] AWS Lambda developer guide: Invoke. h�ps://docs.aws.amazon.com/lambda/latest/dg/API_Invoke.html. Accessed Jul 5 2018, 2018.

[4] A�, L., I��, L., V��, G. M., �� P��, G. Sprocket: Aserverless video processing framework. In Proceedings of the ACMSymposium on Cloud Computing (2018), SoCC ’18.

[5] A��, A., �� Z��, X. Supporting multi-provider serverless comput-ing on the edge. In Proceedings of the 47th International Conference onParallel Processing Companion (2018), ICPP ’18.

[6] B��, A., G��, K. �., K, R. G., �� J��, R. Verifyingdistributed programs via canonical sequentialization. InACM SIGPLANConference on Object Oriented Programming, Systems, Languages andApplications (OOPSLA) (2017).

[7] B��, I., C��, P., F��, S. J., M��, N., M��, V.,R��, R., S��, P., �� T��, O. The serverless trilemma:Function composition for serverless computing. In ACM SIGPLANInternational Symposium on New Ideas, New Paradigms, and Re�ectionson Programming and Software (Onward!) (2017).

[8] B��, G., D��, J., D��, E., H��, M., �� S��,A. Protecting chatbots from toxic content. In 2018 ACM SIGPLANInternational Symposium on New Ideas, New Paradigms, and Re�ectionson Programming and Software (2018).

[9] B��, M., B��, R., �� B��, W. Resource managementof replicated service systems provisioned in the cloud. In NetworkOperations and Management Symposium (NOMS) (2016).

[10] C��, T., K��, F., L��, B., �� Z��, N. Verifyingconcurrent software using movers in CSPEC. In 13th USENIX Sym-posium on Operating Systems Design and Implementation (OSDI 18)(2018).

[11] C��, S. Cloud native technologies are scaling pro-duction applications. h�ps://www.cncf.io/blog/2017/12/06/cloud-native-technologies-scaling-production-applications/,2017. Accessed Jul 12 2018.

[12] D��, A., P��, A., Q��, S., �� S��, S. A. Composi-tional programming and testing of dynamic distributed systems. InACM SIGPLAN Conference on Object Oriented Programming, Systems,Languages and Applications (OOPSLA) (2018).

[13] D��, C., H��, T. A., �� Z��, D. Psync: A partiallysynchronous language for fault-tolerant distributed algorithms. InACM SIGPLAN-SIGACT Symposium on Principles of Programming Lan-guages (POPL) (2016).

[14] E��, A. OpenFaaS. h�ps://www.openfaas.com. Accessed Jul 5 2018.

[15] F��, M., �� F��, D. P. Control operators, the SECD-machine, and the �-calculus. In Proceedings of the IFIP TC 2/WG 2.2Working Conference on Formal Description of Programming Concepts(1986).

[16] F��, J., M��, R., �� E��, S. Engage: A deploy-ment management system. In ACM SIGPLAN Conference on Program-ming Language Design and Implementation (PLDI) (2012).

[17] F��, S., I��, D., C��, S., K��, C., Z��, M.,�� W��, K. A thunk to remember: make-j1000 (and other jobs)on functions-as-a-service infrastructure, 2017.

[18] F��, S., W��, R. S., S��, B., B��, K. V.,Z��, W., B��, R., S��, A., P��, G., �� W��,K. Encoding, fast and slow: Low-latency video processing using thou-sands of tiny threads. In USENIX Symposium on Networked SystemDesign and Implementation (NSDI) (2017).

[19] F�� S��, J., M��, P., N��̄��̇, D., W��, T.,��G��, P. JaVerT: JavaScript veri�cation toolchain. Proceedingsof the ACM on Programming Languages 2, POPL (2018), 50:1–50:33.

[20] G��, Y., �� D��, C. The Architectural Implications of CloudMicroservices. In Computer Architecture Letters (CAL), vol.17, iss. 2(Jul-Dec 2018).

[21] G��, V. B. F., K��, M., M��, D. P., �� B��,A. R. Verifying strong eventual consistency in distributed systems. InACM SIGPLAN Conference on Object Oriented Programming, Systems,Languages and Applications (OOPSLA) (2017).

[22] Google Cloud Functions. h�ps://cloud.google.com/functions/. Ac-cessed Jul 5 2018.

[23] Cloud Functions execution environment. h�ps://cloud.google.com/functions/docs/concepts/exec. Accessed Jul 5 2018, 2018.

[24] G��, A., R��, M., �� F��, N. Machine veri�ed networkcontrollers. In ACM SIGPLAN Conference on Programming LanguageDesign and Implementation (PLDI) (2013).

[25] H��, C., H��, J., K��, M., L��, J. R., P��, B.,R��, M. L., S��, S., �� Z��, B. Iron�eet: Proving practicaldistributed systems correct. In ACM Symposium on Operating SystemsPrinciples (SOSP) (2015).

[26] H��, S., S��, S., H��, T., V��, V.,A��D��, A. C., �� A��D��, R. H. Serverless com-putation with OpenLambda. In USENIX Workshop on Hot Topics inCloud Computing (HotCloud) (2016).

[27] H��, J. Generalising monads to arrows. Science of ComputerProgramming 37, 1–3 (May 2000), 67–111.

[28] J��, E., P�, Q., V��, S., S��, I., �� R��, B. Occupythe cloud: Distributed computing for the 99%. In Symposium on CloudComputing (2017).

[29] Microsoft Azure Functions. h�ps://azure.microso�.com/en-us/services/functions/. Accessed Jul 5 2018.

[30] Choose between Azure services that deliver messages. h�ps://docs.microso�.com/en-us/azure/event-grid/compare-messaging-services.Accessed Jul 5 2018, 2018.

[31] Apache OpenWhisk. h�ps://openwhisk.apache.org. Accessed Jul 52018.

[32] OpenWhisk actions. h�ps://github.com/apache/incubator-openwhisk/blob/master/docs/actions.md. AccessedJul 5 2018, 2018.

[33] P��, A., S��, M., �� S��, S. Veri�cation in the age ofmicroservices. In Workshop on Hot Topics in Operating Systems (2017).

[34] P��, D., S��, A., �� R��, G. KJS: A complete formalsemantics of JavaScript. In ACM SIGPLAN Conference on ProgrammingLanguage Design and Implementation (PLDI) (2015).

[35] P��, R. A new notation for arrows. In ACM International Con-ference on Functional Programming (ICFP) (2001).

[36] Pulumi. cloud native infrastructure as code. h�ps://www.pulumi.com/.Accessed Jul 5 2018, 2018.

13

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155


1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

[37] R��, R. Composing functions into applicationsthe serverless way. h�ps://medium.com/openwhisk/composing-functions-into-applications-70d3200d0fac. Accessed Jul 52018, 2017.

[38] R��, G., �� V��, K. Fault tolerance via idempotence.In ACM SIGPLAN-SIGACT Symposium on Principles of ProgrammingLanguages (POPL) (2013).

[39] S��, I., W��, J. R., �� T��, Z. Programming and provingwith distributed protocols. In ACM SIGPLAN-SIGACT Symposium onPrinciples of Programming Languages (POPL) (2017).

[40] S��, A., B��, S., H��, Y., A��, A., P��, M., ��R��, P. Granular computing and network intensive applications:

Friends or foes? In Proceedings of the 16th ACMWorkshop on Hot Topicsin Networks (2017), HotNets-XVI.

[41] W��, L., C��, S., �� D��, C. Whip: Higher-order contractsfor modern services. In ACM International Conference on FunctionalProgramming (ICFP) (2017).

[42] W��, S., E��, C., P��, S., �� L��, F. Bring-ing middleware to everyday programmers with ballerina. In BusinessProcess Management (2018).

[43] W��, J. R., W��, D., P��, P., T��, Z., W��, X., E��,M. D., �� A��, T. Verdi: A framework for implementing andformally verifying distributed systems. In ACM SIGPLAN Conferenceon Programming Language Design and Implementation (PLDI) (2015).

14

formal foundations of serverless computing · 2020-02-27 · formal foundations of serverless...

Documents