building reliable, high-performance networks with …...building reliable, high-performance networks...

31
Under consideration for publication in J. Functional Programming 1 Building Reliable, High-Performance Networks with the Nuprl Proof Development System Christoph Kreitz Department of Computer Science, Cornell-University Ithaca, NY 14853-7501, USA (e-mail: [email protected]) Abstract Proof systems for expressive type theories provide a foundation for the verification and synthesis of programs. But despite their successful application to numerous programming problems there remains an issue with scalability. Are proof environments capable of rea- soning about large software systems? Can the support they offer be useful in practice? We answer this question by showing how the Nuprl proof development system and its rich type theory have contributed to the design of reliable, high-performance networks by synthesizing optimized code for application configurations of the Ensemble group communication toolkit. In this article we present a type-theoretical semantics of OCaml, the implementation language of Ensemble, and tools for automatically importing system code into the Nuprl system. We describe reasoning strategies for generating verifiably correct fast-path opti- mizations of application configurations that substantially reduce end-to-end latency in Ensemble. We also discuss briefly how to use Nuprl for checking configurations against specifications and for the design of reliable adaptive network protocols. 1 Introduction Advanced type systems have greatly increased our ability to produce reliable soft- ware. In programming languages, type checking, certifying compilers, and extended static checking (Schneider et al., 2000; Leino, 2000) help detecting subtle errors at compile time. In theorem proving, type theories provide the logical foundation for proving programs correct and synthesizing algorithms from formal specifications. Numerous proof assistants for have been built for these expressive formalisms and been used successfully in a variety of applications in mathematics and program- ming. Some of the most prominent of these systems are Alf (Altenkirch et al., 1994; ALFA, n.d.), Coq (Dowek & et. al, 1991; Coq, n.d.), HOL (Gordon & Mel- ham, 1993; HOL, n.d.), Isabelle (Paulson, 1990; Isabelle, n.d.), Lego (Pollack, 1994), Nuprl (Constable et al., 1986; Allen et al., 2000; NuPRL, n.d.), PVS (Owre et al., 1996; PVS, n.d.), TPS (Andrews et al., 1996; TPS, n.d.), and Twelf (Pfen- ning & Sch¨ urmann, 1999). Most applications, however, have dealt only with theoretical algorithms or ide- alizations of real systems. It is not clear whether the proof methods used in them scale well enough to offer practically useful support for software design.

Upload: others

Post on 01-Jun-2020

20 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Under consideration for publication in J. Functional Programming 1

Building Reliable, High-Performance Networkswith the Nuprl Proof Development System

Christoph KreitzDepartment of Computer Science, Cornell-University

Ithaca, NY 14853-7501, USA(e-mail: [email protected])

Abstract

Proof systems for expressive type theories provide a foundation for the verification andsynthesis of programs. But despite their successful application to numerous programmingproblems there remains an issue with scalability. Are proof environments capable of rea-soning about large software systems? Can the support they offer be useful in practice?

We answer this question by showing how the Nuprl proof development system andits rich type theory have contributed to the design of reliable, high-performance networksby synthesizing optimized code for application configurations of the Ensemble groupcommunication toolkit.

In this article we present a type-theoretical semantics of OCaml, the implementationlanguage of Ensemble, and tools for automatically importing system code into the Nuprlsystem. We describe reasoning strategies for generating verifiably correct fast-path opti-mizations of application configurations that substantially reduce end-to-end latency inEnsemble. We also discuss briefly how to use Nuprl for checking configurations againstspecifications and for the design of reliable adaptive network protocols.

1 Introduction

Advanced type systems have greatly increased our ability to produce reliable soft-ware. In programming languages, type checking, certifying compilers, and extendedstatic checking (Schneider et al., 2000; Leino, 2000) help detecting subtle errors atcompile time. In theorem proving, type theories provide the logical foundation forproving programs correct and synthesizing algorithms from formal specifications.Numerous proof assistants for have been built for these expressive formalisms andbeen used successfully in a variety of applications in mathematics and program-ming. Some of the most prominent of these systems are Alf (Altenkirch et al.,1994; ALFA, n.d.), Coq (Dowek & et. al, 1991; Coq, n.d.), HOL (Gordon & Mel-ham, 1993; HOL, n.d.), Isabelle (Paulson, 1990; Isabelle, n.d.), Lego (Pollack,1994), Nuprl (Constable et al., 1986; Allen et al., 2000; NuPRL, n.d.), PVS (Owreet al., 1996; PVS, n.d.), TPS (Andrews et al., 1996; TPS, n.d.), and Twelf (Pfen-ning & Schurmann, 1999).

Most applications, however, have dealt only with theoretical algorithms or ide-alizations of real systems. It is not clear whether the proof methods used in themscale well enough to offer practically useful support for software design.

Page 2: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

2 Christoph Kreitz

ENSEMBLE

RECONFIGUREDFAST & SECURE

of

ENSEMBLE

SIMULATED

Programming EnvironmentOCaml

Deductive SystemNuPRL / TYPE THEORY

PROOF

OPTIMIZE TRANSFORM

EXPORTENSEMBLE

PROOF

RECONFIGURATION

IMPORTENSEMBLE VERIFY

SPECIFICATION

Fig. 1. Linking Ensemble and Nuprl

In this article we will demonstrate that formal logical methods based on expres-sive type systems are capable of supporting the formal design and implementationof large-scale, high-performance network systems. In particular we will show thatlinking the Ensemble group communication toolkit (Hayden, 1998; Birman et al.,2000) to the Nuprl proof development system (Constable et al., 1986; Allen et al.,2000) provides an infrastructure for the application of logical inference techniques toa real-world system. We call this infrastructure a logical programming environmentfor Ensemble (Kreitz et al., 1998). Within the logical programming environmentwe have developed logical optimization tools that can substantially increase theperformance of Ensemble and as well as mechanisms that support the verificationand formal design of Ensemble protocols.

Figure 1 illustrates the methodology of our approach. We link Ensemble andNuprl by developing tools for importing Ensemble’s system code into the Nuprl

proof development system and vice versa. These tools convert Ensemble code intoterms of Nuprl’s logical language and are based on a type-theoretical semantics ofEnsemble’s implementation language OCaml,

Within Nuprl we then optimize the (represented) code of application configu-rations of Ensemble by applying semantics-preserving logical transformations andexport the result back into the OCaml programming environment. In addition tothat we prove that the generated code, while being significantly more efficient, hasthe same functionality as to the original one. We can also apply formal reasoningstrategies to verify Ensemble protocols and use formal techniques to support thedesign new communication protocols for Ensemble.

In the following section we will give a brief account of Nuprl and Ensemble.Section 3 will describe the type-theoretical semantics of OCaml as well as thetools for importing and exporting system code. Formal optimization is presented inSection 4, while Section 5 describes research on the verification and formal design ofEnsemble protocols. Section 6 addresses related work. We conclude by discussinginsights gained from our research as well as future work.

Page 3: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 3

2 Preliminaries

2.1 Nuprl

The Nuprl proof development system (Constable et al., 1986) is a framework forthe development of formal mathematical knowledge as well as for the synthesis,verification, and optimization of software.

Nuprl’s logical language is a significant extension of Martin-Lof’s intuitionisticType Theory (Martin-Lof, 1984; Constable, 1998) that includes formalizations offundamental mathematical concepts, an expressive data system, and a functionalprogramming language similar to the core of ML. Nuprl’s type system is open-ended in the sense that new types may be added to the theory if needed. Recentadditions for reasoning about classes and objects include very dependent functiontypes, dependent intersection and records, and a union type (Hickey, 1996; Kopylov,2000; Constable & Hickey, 2000; Hickey, 2001).

As Nuprl expressions are defined independently of their types, Y combinatorsand similar constructs may be used as parts of Nuprl terms, which makes it pos-sible to represent all computable functions in type theory. In the course of a formalargument, however, expressions have to proven to belong to some type. Nuprl

proofs are developed in a top-down sequent calculus: proof goals are refined intosubgoals by application of inference rules until they can be handled by axioms oralready proven lemmata.

The Nuprl system (Constable et al., 1986; Allen et al., 2000; NuPRL, n.d.)supports an interactive development of formal mathematical theories and programverifications. It provides a highly visual proof editor , a tactic mechanism for thedevelopment of proof strategies through programmed application of inference rules,decision procedures for standard arithmetic and equality reasoning, mechanisms forextracting programs from proofs and evaluating programs, and an extendable libraryof verified knowledge from various domains. Furthermore, users may use definitionalabstractions to extend the formal language of type theory and display forms tocustomize the outer appearance of terms without changing their internal structure.

The system has been used in increasingly large applications in mathematics andprogramming, such as constructive versions of Girard’s paradox (Howe, 1987), Hig-man’s lemma (Murthy & Russell, 1990), and abstract algebra (Jackson, 1994),verifications of a logic synthesis tool (Aagaard & Leeser, 1993) and of the SCIcache coherency protocol (Howe, 1996), and in our current work on communicationsystems (Kreitz et al., 1998; Hickey et al., 1999; Liu et al., 1999; Bickford et al.,2001c).

In its the newest release (Allen et al., 2000), Nuprl features an open, distributedarchitecture that is organized as a collection of independent communicating pro-cesses centered around a persistent knowledge base (or library). This enables usersto connect external proof tools to Nuprl and to use them simultaneously andasynchronously in complex proofs. Currently, users may invoke the MetaPRL proofengine (Hickey & Nogin, 2000; Hickey, 2001; MetaPRL, n.d.) and the (intuitionistic)first-order theorem prover JProver (Schmitt et al., 2001). Additional proof systemswill be connected in the future.

Page 4: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

4 Christoph Kreitz

2.2 Ensemble

Ensemble (Hayden, 1998) is a high-performance network

Total

Frag

Membership

Network

Top

applicationEnsemble

Fig. 2. Ensemblearchitecture

protocol architecture that aims at securing critical applica-tions. It is a successor of the widely adopted Isis (Birman& van Renesse, 1994) and Horus (van Renesse et al., 1996)systems and designed particularly to support group member-ship and communication protocols. Ensemble is currentlyused in the BBN Aqua and Quo platforms, a fault-toleranttestbed at JPL, the Adapt adaptive multimedia middlewaresystem at Lancaster University, a multiplayer game by Sega-soft, and in the Alier financial database tools.

Ensemble’s architecture is based on the notion of a pro-tocol stack . The system is constructed from a library of oversixty micro-protocol modules, or layers, which implementfragmentation and re-assembly, flow control, message order-ing, buffering and retransmission, signing and encryption,group membership, synchronization, and other functionality.Micro-protocols can be stacked in a variety of ways to meetthe communication demands of an applications. Each mod-ule adheres to a common interface consisting of a top-level and a bottom-level part.The top-level interface of a module communicates with the bottom-level interfaceof the module immediately on top of it. The interface is event-driven: modules passevent objects to the adjacent modules. Certain types of events (e.g. send events)travel down, while others (e.g. message delivery events) travel up the stack.

Ensemble is written almost entirely in OCaml (Leroy, 2000). The choice ofOCaml instead of C, which was used for implementing Ensemble’s predecessorHorus, significantly reduced the size of micro-protocol code and made it easier todevelop and maintain Ensemble modules. The main benefit of OCaml, however,is that it has a precise mathematical semantics, which can be employed for the au-tomated verification, optimization, and generation of Ensemble protocol stacks.These operations can be tedious and error-prone if performed manually. Using theNuprl proof development system enables us to address both problems: automa-tion makes it possible to re-use common reasoning strategies, and formal reasoningguarantees the correctness of code transformations.

3 Formalizing the Semantics of OCaml

OCaml (Leroy, 2000), is a strongly typed, (almost) functional programming lan-guage, which has been extended with a module system and an object calculus. Itsfunctional core is similar to the language of Nuprl’s type theory. But it has adifferent, less rigid syntax and contains many additional features.

To support formal reasoning about systems that are implemented in OCaml wehave to be able to automatically convert OCaml programs into terms of of Nuprl’stype theory that capture the semantics of these programs. For this purpose wehave developed a shallow type-theoretical semantics of OCaml that is faithful with

Page 5: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 5

respect to the informal semantics given in the OCaml manual (Leroy, 2000) andto the operational semantics generated by the OCaml compiler.

We have “implemented” this formalization using Nuprl’s definition mechanism:definitional abstractions add new terms to Nuprl’s type theory that capture theoperational semantics of an OCaml language construct, while display forms makesure that the outer appearance of these terms is identical to the OCaml constructthey represent. An OCaml function declaration, for instance, is represented by thefollowing two objects.

ABS Function{}(.p; .e) ≡ λs,env. inr(λe1. p e1 e), s

DISP function p -> e ≡ Function{}(.p; .e)

The abstraction declares a new term Function with (no parameters and) twosubterm arguments p and e and defines it to be equal to the term on the right handside. The corresponding display form describes that the new term is to be displayedas function p -> e . Together, abstraction and display form represent both thesemantics and the syntax of the OCaml function declaration. In the rest of thispaper we will omit the abstract description of new terms and simply write

function p -> e ≡ λs,env. inr(λe1. p e1 e), s

In addition to the formal representation we have also developed a formal pro-gramming logic for reasoning about OCaml programs and their evaluation. Theprogramming logic is expressed in the form of inference rules that are implementedas Nuprl tactics and based on well-formedness theorems that we prove for eachlanguage construct. This guarantees that the rules are correct with respect to thetype-theoretical semantics of OCaml and allows formal reasoning to be performedat the level of OCaml programs instead of the underlying type-theoretical concepts.

Finally, we have created tools that convert OCaml programs into their formalNuprl representations and store them as objects of Nuprl’s library. These toolsare necessary to make the actual OCaml-code of an Ensemble protocol stackavailable for formal reasoning within Nuprl and to keep track of modifications inEnsemble‘s implementation.

In the rest of this section we briefly describe the formalization of OCaml and thetools that translate between OCaml programs and their formal representations.

3.1 A Formal Model for Core OCaml

The basic language of OCaml is centered around a functional core enhanced bypatterns, references, and exceptions. We will briefly discuss aspects of formalizingthese components and then describe a type-theoretical model for OCaml.

OCaml’s functional core is a simple applicative language with constants, higher-order functions, local bindings, call-by-value application, and recursive definition.Its expressions and values are

(Expressions) e ::= v | e1 e2 | let x=e1 in e2 | let rec x=e1 in e2

(Values) v ::= c | x | function x -> e

where x is a variable and c a constant.

Page 6: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

6 Christoph Kreitz

Apart from notational differences this language is very similar to the core ofNuprl’s type theory. OCaml’s function declaration function x -> e has thesame semantics as the lambda-abstraction λx.e in type theory, function applica-tion in OCaml and Nuprl are the same, a let-binding let x=e1 in e2 can beexpressed as application of an abstraction (λx.e1) e2 , and a recursive definitionlet rec x=e1 in e2 as (λx. Y (λx.e1) in e2 . OCaml variables are representedas Nuprl variables while the representation of constants depends on the represen-tation of the corresponding data types.

Patterns are language constructs that enable a programmer to decompose datastructures in a convenient way. A local binding of the form let p=e1 in e2 matchesthe pattern p against the expression e1 and binds the variables of e2 that occur in p

accordingly. In contrast to a simple let-binding, the pattern p may contain severalvariables that are grouped by data structure constructors, which determine thefragment of e1 that will be bound to a variable in e2.

As Nuprl’s type theory does not include general pattern matching, OCaml func-tion declarations and let-bindings cannot be mapped directly onto correspondingNuprl constructs anymore. Instead, we will explicitly represent OCaml’s runtimeenvironment of variable bindings and separate expressions from patterns. OCaml

expressions are functions that evaluate variables and terms in the environmentwhile patterns are functions that update the environment of some expression.

Formally, an environment env is represented as a table consisting of pairs ofvariables and values. Table lookup (env[x]) and extending the table by a newbinding (env@{x 7→v}) are defined via predefined standard list and pair operations.

{} ≡ []env@{x 7→v} ≡ <x,v> :: envenv[x] ≡ lookup x env

An OCaml variable expression x will be represented by a function that expects anenvironment env as input and looks up the current value of x in that environment.In contrast, a variable pattern xp will be represented by a function that takes twoexpressions e1 and e2 and modifies an environment env for evaluating e2 by bindingxp to the value of e1 in env. In this framework, a let binding let p=e1 in e2 isa simple application of the pattern p to the expressions e1 and e2, while a functiondefinition function p -> e applies p to some input expression e1 and to e.

Reference cells in OCaml are special instances of mutable records, which areassociated with generalizations of the familiar operations ref e , !e , and e1:=e2.

As they may occur in arbitrary OCaml expressions, causing them to have side-effects, we need to extend our formal model by explicitly representing a global state.

We represent OCaml expressions by functions that operate on an environmentenv and a store s, evaluate a term with respect to env and s, and return both thevalue and the (possibly) updated store. Similar to an environment, a store s is atable of pairs of addresses (natural numbers) and values. The creation of new tableentries (New(s)), store lookup (!s[addr], and updating the contents of a store cell(s[addr←v] are defined via standard list and pair operations.

Page 7: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 7

[ ] ≡ []New(s) ≡ max (map fst s) + 1s[addr←v] ≡ <addr,v> :: s!s[addr] ≡ lookup addr s

A reference ref e evaluates the expression e with respect to the current storeand environment, determines a free address in the updated store s1, assigns thevalue v of x to that address, and then returns the address and the modified stores1[addr←v] as result. Dereferencing !e means evaluating e and then looking upthe resulting address in the (updated) store (!s1[addr]). An assignment e1:=e2

first evaluates e2 and then e1. Afterwards it assigns the value of e2 to the storeaddress determined by e1. The result of the expression is the unit value ().

Exceptions are an ML concept for handling failure while ensuring typability ofexpressions. Exceptions may be raised by runtime failures or by explicit calls to thefunction raise and handled using the expression try e1 with p -> e2 .

Exceptions can be viewed as alternative to the value of an expression and belongto the same type. The exception Division by zero generated by evaluating x/0,

for instance, has the type int. Formally, OCaml data types are a disjoint unionEXCEPTION + T of the type of exceptions and the type T of the data type’s values.Values v and exceptions exn have to be tagged accordingly. For this purpose we usethe Nuprl terms inr(v) and inl(exn) .

We represent raise exn as a function that for each environment env and store sreturns inl(exn) and an unchanged store. try e1 with p -> e2 is representedby a function that returns the value of e1 unless the evaluation fails. Otherwiseit matches the resulting exception exn is against the pattern p and proceeds withevaluating e2 in an enriched environment.

As exceptions may occur in arbitrary OCaml expressions, our representationof expressions and patterns must always check whether the result of evaluating anexpression is a value or an exception. To simplify the formalization, we introduce anabbreviation let↓ inr(v),s = eval1 in eval2 that catches this failure check.

let↓ <inr(v),s> = eval1 in eval2

≡ let <r,s1> = eval1 incase r of inl(exn) => inl(exn), s

| inr(v) => eval2

where let <x1,x2> = e1 in e2 is Nuprl’s local decomposition of pairs.

The formal model. The combination of the functional core, patterns, references,and exceptions describes all the essential features of the Caml part of OCaml,i.e. the language fragment without the object and module system. The expressions,values, and patterns of this core language are

(Expressions) e ::= v | e1 e2 | let p=e1 in e2 | let rec p=e1 in e2

(Values) v ::= c | x | function p -> e | ref | ! | :=| raise | try e1 with p -> e2

(Patterns) p ::= xp

where x is a variable, xp a variable pattern, and c a constant.

Page 8: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

8 Christoph Kreitz

e1 e2 ≡ λs,env. let↓ <inr(v),s1> = e2 s env inlet↓ <inr(f),s2> = e1 s1 env in

(f dve) s2 env

let p = e1 in e2 ≡ p e1 e2

let rec p = e1 in e2 ≡ p (Y (λe. p e e1)) e2

xv ≡ λs,env. inr(env[xv]),s

function p -> e ≡ λs,env. inr(λe1. p e1 e), s

ref e ≡ λs,env. let↓ <inr(v),s1> = e s env in

let addr = NEW(s1) ininr(addr), s1[addr←v]

!e ≡ λs,env. let↓ <inr(addr),s1> = e s env ininr(!s1[addr]), s1

e1 := e2 ≡ λs,env. let↓ <inr(v),s1> = e2 s env inlet↓ <inr(addr),s2> = e1 s1 env in

inr (), s2[addr←v]

raise exn ≡ λs,env. inl(exn), s

try e1 with p -> e2 ≡ λs,env. let <r,s1> = e1 s env in

case r of inl(exn) => (p dexne e2) s1 env| inr(v) => inr(v), s1

xp ≡ λe1,e2.λs,env. let↓ <inr(v),s1> = e1 s env ine2 s1 (env@{xp 7→v})

Fig. 3. Nuprl representation of core OCaml

The type-theoretical formalization of this language is described by the Nuprl def-initions given in Figure 3. It represents OCaml expressions as elements of the type

EXPR ≡ STORE → ENV → (EXCEPTION + VALUE) × STORE

where ENV is the type of variable binding environments (tables of variable namesand values), STORE the type of stores (tables of addresses and values), EXCEPTIONthe type of exceptions (represented as atomic strings), and VALUE the type of values(Nuprl’s type Top). To “lift” a value v to an expression that always evaluates tov, we use the abbreviation

dve ≡ λs,env. inr(v), s

Lifting is necessary to express the call-by-value evaluation order in the represen-tation of function application and exception handling.

OCaml patterns are represented as elements of EXPR → EXPR → EXPR. Thelanguage core contains only variable patterns.

3.2 Modules and Objects

Modules and compilation units are second class objects of the OCaml programminglanguage. They cannot be used as parts of expressions and have no operationalsemantics per se. Instead, they provide a means for structuring code, for instance

Page 9: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 9

by introducing names for user-defined types and functions and binding them to aparticular expression, describing the signature of an abstract data type, providingalternative implementations of the same function, and disambiguating references tonames that are defined in multiple modules.

As Ensemble’s implementation uses neither functors nor the with operator,it suffices to represent each module expression as a separate object of Nuprl’slibrary, i.e. as object on the meta-level of Nuprl’s type theory. We use meta-levelobject generators to map type and value definitions onto Nuprl abstractions thatglobally declare the corresponding names, prefixed by the name of the module,as abbreviation for a certain expression or type. Module definitions and the open

command are instructions that influence the behavior of these object generatorsand make sure that named references within type-theoretical expressions are linkedto the correct abstraction. As a consequence each module expressions is accessibleas separate Nuprl object and can be reasoned about individually.

We are currently investigating a more general approach that formalizes moduleexpressions within type theory instead of its meta-level. In this approach we repre-sent module expressions as functions that update a global environment. This allowstreating named reference like variables that consult the environment during evalu-ation. The approach is theoretically more satisfactory and also allows representingfunctors as functions on global environments. However, it is also more complex andits consequences for practical reasoning need to be evaluated.

Objects, methods, classes and inheritance are not used in Ensemble’s imple-mentation and currently not represented. Although objects could be understoodas generalization of reference cells, the current type theories do not offer sufficientsupport for expressing the formal semantics of methods and inheritance. Prelimi-nary studies by A. Kopylov indicate that a combination of dependent records andunion types may provide a foundation for representing objects in the future.

3.3 Extent of Formalization

As the main purpose of our formalization of OCaml is to provide a foundation forthe verification, optimization, and synthesis of Ensemble protocols and stack, wehave focused on developing type-theoretical representations of language constructsthat are actually being used instead of aiming at completeness with respect to allminor details of OCaml. The formalization was originally based on OCaml-1.07and later migrated to OCaml-2.02. We are currently working on adding the newfeatures of OCaml-3.0x.

Chapter 6 of the OCaml manual (Leroy, 2000) lists all the language constructsof OCaml and gives their precise syntax and an informal semantics. Our type-theoretical representation provides both a formal semantics of these language con-structs through Nuprl abstractions and a representation of their syntax throughcorresponding display forms. Nuprl precedence objects are being used to controlthe automatic generation of parentheses that follow OCaml’s rules for precedencesand associativity. Finally well-formedness theorems show that the Nuprl represen-tation of a language construct is faithful with respect to OCaml’s type system.

Page 10: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

10 Christoph Kreitz

For instance, representations of expressions must be members of (representationsof) the corresponding types. Our formalization currently includes the following

• All type expressions listed in §6.4, except for object types and #-types. Labelsin (function) types are currently not supported and recursive types have notyet been fully formalized.

• All constants listed in §6.5.

• All patterns listed in §6.6, except for alias patterns, polymorphic (tagged)variant patterns, and variant abbreviation patterns.

• All expressions listed in §6.7, except for polymorphic variants, bitwise logicaland floating point operators, method invocation, object creation, coercion,and object duplication. There is not yet a generic representation of mutuallyrecursive functions. Instead, there are Nuprl definitions for declaring a fixednumber of (up to 6) mutually recursive functions.

• All type definitions listed in §6.8, except for constraint definitions. Optionalprefixes for type parameters, which indicate whether a type constructor is tobe co- or contravariant with respect to that parameter, are not yet supported.

• Classes, listed in §6.9, are not yet supported.

• There is limited support for module specifications, module implementationsand compilation units, listed in §6.10–12, as described in Section 3.2.

The formalization had to take into account that OCaml constructs have amore flexible syntax than the basic terms of type theory. Record type declarations{f 1:T 1; ..; fn:T n} , for instance, can have arbitrarily many components, whileNuprl terms must have a fixed number of subterms. This means that although weknow exactly how to represent record types in Nuprl (i.e. by dependent functiontypes f:FIELD → T [f ]), there cannot be a single Nuprl abstraction for staticallydefining this representation.

Instead, we use an iteration of primitive abstractions to build a formal represen-tation of record types. We introduce an abstraction for a singleton record type

{f 1:T 1} ≡ f:FIELD → (if f = f 1 then T 1 else Top)

and build larger record types using Nuprl’s intersection operator. Since we know{f 1:T 1; f 2:T 2} = {f 1:T 1}∩ {f 2:T 2}, we can represent arbitrary record types asfinite intersections of singleton record types and use the iteration concept forNuprl display forms to make sure that {f 1:T 1}∩ ..∩{fn:T n} is displayed as{f 1:T 1; ..; fn:T n}. For well-formedness reasons, we lift this type to an expres-sion type of the form STORE→ ENV→ (EXCEPTION + T)× STORE while preservingthe display.

Thus in general, OCaml language constructs do not correspond directly toNuprl abstractions but to Nuprl terms that are constructed by combining severalprimitive abstractions. For each construct we have developed a meta-level term gen-erator , i.e. an ML function that builds the Nuprl representation of the construct.These term generators are use crucial for automatically creating the formal rep-resentation of a piece of OCaml code and used by the tools that import OCaml

source code into Nuprl as well as by the tools for synthesizing and optimizing code.

Page 11: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 11

In addition to the formalizations of language constructs we have imported a largefragment of the OCaml libraries listed in §18–28 into Nuprl using the automatictools described in Section 3.5 below. For externally defined library functions thisalso meant providing “external” implementations, i.e. explicit representations towhich the external command could link.

3.4 A Programming Logic for OCaml

The programming logic for OCaml describes the rules for formal reasoning aboutOCaml programs in a way that the underlying type-theoretical concepts becomeinvisible. For each language construct we have developed a top-down sequent rulefor decomposing a program term into smaller fragments while reasoning about itstype or properties (which in Nuprl is formally the same). The rule for decomposingrecord expressions, for instance, is as follows.

∆ ` {f 1=v1; ..; fn=vn} ∈ {f 1:T 1; ..; fn:T n}BY RecordExprMem

∆ ` v1∈ T 1

...∆ ` vn

∈ T n

It states that a record expression {f 1=v1; ..; fn=vn} belongs to some recordtype {f 1:T 1; ..; fn:T n} if we can prove that each field value vi belongs to thecorresponding field type T i. ∆ is a placeholder for the sequent’s assumptions, whichremain unchanged. The rule is implemented as a tactic that decomposes a proofgoal into n subgoals of the form ∆ ` vi

∈ T i , provided the goal matches thefirst line of the rule.

The implementation of the rule as Nuprl tactic guarantees its correctness withrespect to the type-theoretical semantics of OCaml. In most cases the tactic sim-ply calls the well-formedness theorem that comes with the Nuprl representationof the OCaml language construct. For constructs like record expressions, whoserepresentation is built by iterating primitive abstractions, the tactic consists of acontrolled iteration of calls to more primitive decompositions. This makes the ac-tual inference rule a bit more flexible than described above, as the order of fieldsin a record expression is irrelevant for the decomposition.

Similarly we have developed evaluation rules for each reducible OCaml expres-sion. The rule for evaluating the selection of a field from a record expression, forinstance, is the following.

{f 1=v1; ..; fn=vn}.f i −→ vi

The implementation of this rule as Nuprl tactic consists of a controlled appli-cations of Nuprl computation rules, which makes sure that the internal represen-tation of OCaml expressions is not revealed by evaluation.

A complete list of all the inference and evaluation rules for OCaml is givenin (Kreitz, 1997). (The formal model for OCaml in (Kreitz, 1997) is simpler thanthe one described here and only allows a limited representation of imperative fea-tures. This affects the implementation of these rules but not the rules themselves.)

Page 12: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

12 Christoph Kreitz

OCaml

Programming Environment Deductive System

Preprocessor

Camlp4 Conversionmodule

Pretty printermodified

NuPRL-ML

CodeIntermediate

Parser

Ocaml-Code

Text file EXPORT

IMPORT

Print

Represen-

IMPORT

SyntaxTree

Abstract

GeneratorsObject

Term- +

tation

Type InformationDisplay FormsAbstractions

Ocaml-CodeSimulated

basic Ocaml-constructs

Representations of+

NuPRL Library

NuPRL / TYPE THEORY / Meta-Language ML

Fig. 4. Importing and Exporting OCaml source code

3.5 Import and Export of OCaml Code

The type-theoretical semantics of OCaml and its “implementation” as Nuprl ab-stractions and display forms provide the foundation for formal reasoning aboutOCaml programs. However, if we want to reason about real system implementa-tions with ten thousands of lines of code, we have to provide mechanisms that auto-matically translate the OCaml code into its Nuprl representation and vice versa.

Figure 4 illustrates our mechanisms for importing and exporting OCaml sourcecode. The tools for importing OCaml code into Nuprl have to analyze the syn-tax of the OCaml source code, create the type-theoretical terms that representit, and store them as objects in Nuprl’s library. In order to ensure faithfulnesswith respect to the OCaml programming environment we use the Camlp4 parser-preprocessor (de Rauglaudre, 2000), an isolated version of the OCaml-parser, astool for analyzing program text. Camlp4 parses OCaml source code, generatesits abstract syntax tree, and then calls an output module for further processing,e.g. pretty-printing or dumping the binary.

To convert the abstract syntax tree into Nuprl objects, we have implementeda new output module for Camlp4 that creates intermediate code, which causesNuprl to build the corresponding abstractions and display forms. The modulegenerates pieces of this code for each node of the syntax tree, distinguishing thevarious kinds of identifiers, expressions, patterns, types, signature items, and mod-ule expressions as specified in Appendix A of (de Rauglaudre, 2000).

The intermediate code is a program in Nuprl’s meta-language ML that callsthe term and object generators described in Sections 3.3 and 3.2 above to build theNuprl representations of the OCaml code. It will be passed to the Nuprl system,which generates abstractions and display forms for each function, type, and moduledeclared in the source code. In the process, the object generators also resolve issuesthat arise when linking the code of several modules and cannot be addressed bythe parser. These are name resolution (i.e. linking a named reference to a specificobject in some module), determining whether an identifier represents a variable or anamed reference, and resolving overloading of operator names. In addition to that,the object generators also attempt to state and prove a well-formedness theoremfor each newly generated object, using its OCaml signature, if available, or a typeinference algorithm to determine its type.

Page 13: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 13

To make synthesized OCaml programs available to the OCaml programmingenvironment, we have also implemented a mechanism for exporting Nuprl represen-tations of OCaml code. Since we have designed the display forms and precedenceobjects for each OCaml language construct in a way that they obey OCaml’ssyntax requirements, the Nuprl system displays and prints terms that representOCaml programs as syntactically correct OCaml source code. Exporting OCaml

code is therefore simply a matter of selecting the code pieces to be exported andprinting them into a file. The generated program text can then be compiled, linked,and executed in the OCaml environment without further modifications.

4 Formal Optimization

Ensemble’s modular approach to building communication systems has many ad-vantages over monolithic systems. Small components are easier to design, specify,develop, test, verify, and optimize, while application systems may be more read-ily adapted to new environments and extended at run-time with new components.However, building systems from components usually comes with a performancepenalty: the abstraction barriers between the components impose high overheadsarising from additional function calls, redundant code, and code that is not used in aparticular configuration. In this section we will show how Ensemble’s performancecan be significantly improved by employing formal logical optimizations that turnEnsemble into one of the fastest reliable multicast systems currently available.

(Hayden, 1998) suggests several optimizations that could be applied to stack-based architectures like Ensemble that are implemented in functional program-ming languages like OCaml.

1. Avoid garbage collection during message processing by freeing space allocatedfor messages after they have been sent or delivered.

2. Avoid marshaling of small objects when sending data over the net.3. Delay non-critical message processing by sending and delivering messages

before updating a module’s state and buffers.4. Compress the module stack for common sequences of execution, creating a

fast-path that can be used in the common case.5. Compress message headers that are added by micro-protocols while processing

a message that is being sent.

The first three steps do not affect the modular approach as such and can beaddressed when implementing the system modules by adding low-level proceduresthat overwrite defaults of OCaml’s run-time system and by adopting a particularprogramming style. However, the final two optimizations require generating specialcode after an application system has been configured from the modules.

Generating fast-paths for common cases and compressing headers is far beyondthe capabilities of current compiler optimization techniques. Therefore previouswork (Abbott & Peterson, 1993; Engler & Kaashoek, 1996; Hayden, 1998) involvessignificant annotation of the code or hand-optimization, which is a difficult anderror-prone process. By using a formal logical tool like Nuprl, we can completelyautomate these optimization steps and prove that they do not introduce any errors.

Page 14: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

14 Christoph Kreitz

4.1 Fast-path Optimization

In most applications of communication systems it is easy to identify common se-quences of execution. Data are being sent and received and there is no need formessage partitioning, retransmission, buffering, synchronization, etc. This meansthat only a small fragment of the code of a protocol stack is involved in processingthe message and that some micro-protocols are not being activated at all. Fast-pathoptimization aims at improving the system’s performance by identifying the codethat is actually used in the common case.

To understand fast-path optimization, it is useful to

Bottom

no

Top

Pt2Pt

Mnak

Full S

tack

no

APPLICATION

yes

yes

CCPdown

CCPup

NETWORK

TRANSPORT

BypassCode

Fig. 5. Placement ofbypass code in Ensemble

think of a protocol as a function that takes the internalstate of the protocol and an input event and produces anupdated state and a list of output events. In this view,a protocol stack is a macro-protocol built by composingprotocol functions. A protocol can be optimized if wecan describe its regular state and common input events.Formally, we use a Common Case Predicate (CCP), aBoolean function on the state of a protocol and an inputevent, for this purpose. CCPs are usually specified by theprogrammer of a protocol or may be determined fromrun-time statistics.

For example, a CCP may express that the event isa deliver event whose sequence number is equal to thenext expected packet to arrive. If the CCP is satisfied fora received message, then the message may be deliveredand the index of the next expected packet incremented,while the message would have to be buffered otherwise.

Analyzing the path of events that satisfy the commoncase predicate through the code of a protocol stack helpsisolating the code that is actually being used. We call this path a fast-path throughthe protocol stack and the resulting code a bypass, which will be used to processevents in the common case. To decide whether a message can be handled by thebypass code or has to go through the original stack, we use the same CCP thatwas used to generate the bypass code, as illustrated in Figure 5.1 It is necessary togenerate efficient code for this CCP, as it will be executed for every event.

4.2 Knowledge-based Optimization

The optimizations that are necessary to generate bypass code can be expressed asconditional rewrite steps that simplify and evaluate code fragments in the contextof a logical proposition, the common case predicate. To perform these steps, we usethe Nuprl proof development system. Nuprl enables us to automatically generatebypass code and to wrap it by a module to be inserted into Ensemble. Additionally,we can produce a formal proof that the generated code is equivalent to the originalone with respect to the formal semantics of the programming language.

1 The transport module below the protocol stack provides marshaling of messages.

Page 15: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 15

equivalent to

Composition

Stack

Layers

� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � � � � � � � � � � �� � � � � � � � � � � � � � ! � � " � � � # � � � � $ $ � � % � � � � � � & & � ' � � � � �� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �� ( � � � � � � &� ( � � � � � � �� ( � � ) � *� ( � � + � � #� ( � � , % � � � ( � � � ( ( * � � �� ( � � - � # � � � � � ! � ( ( * � � � � �� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �* � � � ! � . � � � / � & � � � / � � � * � 0 � � � � � � � � � � � 0� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �

� ( � & � � . 1� � / % / � & 2� � � � % � / * 3 4 � � % � / * � � / � � � � � � � � � � � � � � 5� � / % & � � �2� � � � % � / * 3 4 � � % � / * � � / � � � � � � � � � � � � � � 5� � � � � � / � � 5& � � � 6 ! � 2� & � 7 � � � � � � � 5& � � � � � / %2� & � 7 � � � � � � � 5! � � $ * � " � � 6 ( � / 8� $ � � * 5! � � $ * � & � � � � 6 ( � / 9� & � 7 � � � � � � � � 5! � � $ * � � � � � * � � &:� � � % � / * � � � � * � � & 5! � � $ * � * � � % � � " � $ � � * 5! � � $ * � � � 6 & # � � ( � � � ! � 5! � � $ * � $ * � / � � �;� $ * � / � � � 5! � � $ * � � � $ * � / �:� $ � � * 5! � � $ * � � ( $ * � / � � �9� $ � � * 5! � � $ * � � � � * � �<� $ � � * � � � � � � =

� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �

* � � � � & � * & � % & � .

* � � � * � & & � * & � % & � 1 � ( � � . � ( 5 � ( � ! � � . � ( � ! 5 � � � � . � � 5 � � * ! � � . � � * ! 5 � � � ! � � . � � � ! = .

* � � ( � � * � � % � $ % � � . � ( � % � $ %

� � � � ( * ! � � * � � % � � . ! � / � " � � � ( � � % # � �� � � � ( � ! � � * � � % . ! � / � " � � � ( � � % # � �� � � � � � � * � � % � $ % . ! � / � " � � � ( � � % # � �� � � � � � ! � � * � � % . ! � / � " � � � ( � � % # � �� � 1 � ( � � . � ( � � * � 5 � ( * ! � � . � ( * ! � � * � 5 � ( � ! � � . � ( � ! � � * � 5 � � � � . � � � � * � 5 � � � ! � � . � � � ! � � * � =

* � * � � " & % & . � � � � � � � � � � � � � * � & - � � � � � � / � * - � � � � � � � � � " & % &

* � . � � � � � � � & � * * � � ! � *� � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � � �

> ? @ A B C D B E ? @ B D F B @ E @ G @ ? H I JK L M N O @ O G > O P ? @ Q ? @ R B > G S ? T E UL V> ? @ E W X @ B D J @ B D E @ G @ ? H I O N> ? @ E Y X F B @ J F B @ E @ G @ ? H I O N> ? @ > B B D K E W X E Y V K ? C O @ X C O Z > V J[ \ ] ? ] ? U > B B D K I ] N K K E W X E Y V X ? C O @ V C O Z > ? H ^ _C G @ A Q ? H R O @ Q` a D b K ? H X C E c V ^ _> ? @ E W X ? H E J @ B D K E W X a D b K ? H X C E c V V O N> ? @ K ? C O @ X C O Z > V J E D > O @ d @ B D K ? C O @ X C O Z > V ? H E O NK K K E W X E Y V X ? C O @ V X C O Z > V` e N b K ? H X C E c V ^ _> ? @ E Y X ? H E J F B @ K E Y X e N b K ? H X C E c V V O N> ? @ K ? C O @ X C O Z > V J E D > O @ d F B @ K ? C O @ X C O Z > V ? H E O NK K K E W X E Y V X ? C O @ V X C O Z > VV K K E W X E Y V X ? C O @ V C O Z >O NK L f ] @ ? T Q G N Z > ? T @ G g ? E G E O N c > ? ? H ? N @ G N Z @ Q ? N D G E E ? E O @ @ BL G D D T B D T O G @ ? > G S ? T G N Z @ Q ? N E D > O @ E @ Q ? ? C O @ @ ? Z ? H ? N @ E UL V> ? @ Q Z > T J I ] N A @ O B N` K K E W X E Y V X e N b K ? H X C E c V V ^ _> ? @ E W X ? C O @ @ ? Z J @ B D K E W X e N b K ? H X C E c V V O N> ? @ K ? C O @ X C O Z > V J E D > O @ d @ B D K [ \ ] ? ] ? U ? C D @ S X [ \ ] ? ] ? U ? C D @ S V> B B D K E W X E Y V K ? C O @ X C O Z > V` K K E W X E Y V X a D b K ? H X C E c V V ^ _> ? @ E Y X ? C O @ @ ? Z J F B @ K E Y X a D b K ? H X C E c V V O N> ? @ K ? C O @ X C O Z > V J E D > O @ d F B @ K [ \ ] ? ] ? U ? C D @ S X [ \ ] ? ] ? U ? C D @ S V> B B D K E W X E Y V K ? C O @ X C O Z > VO NK K E W X E Y V X Q Z > T V

Composition Theorems

Up/Linear Up/BounceUp/Split

Dn/Split Dn/BounceDn/Linear

Top Layer

Layer

Layer

Bottom Layer

(static, a priori)

Optimize Common Case

Verify Simple Compositions

Application Stack

(dynamic)

Optimize Common Case

(static, a priori)

Join & Generate Code

Stack Optimization Theorems

Layer Optimization TheoremsUp/Send Up/Cast Dn/Send Dn/Cast

Up/Send Up/Cast Dn/Send Dn/Cast

NuPRL

Code

OCaml Environment

Protocol Layers

Compose Function

Optimized Application Stack

Fig. 6. Optimization methodology: composing optimization theorems.

Experience has shown that purely tactic-based rewrite techniques are not appro-priate for optimizing protocol stacks. Although they are very useful for optimizingindividual micro-protocols (see Section 4.3) they scale badly when protocols are as-sembled into a stack. The reason for that is that the code for composing protocolsmust provide for the situation that a protocol’s input event may generate severaloutput events to be sent to both adjacent protocols. Tracing the path of an eventthus requires an optimization tactic to deal with the formal representation of theentire code of the protocol stack – more than 20,000 lines of code – at once.

Furthermore, although the optimization of individual micro-protocols is largelyautomated, it occasionally requires expertise about the actual implementation tobe provided interactively. In contrast to the programmers who designed the micro-protocols, application designers who configure the components of a communicationsystem to suit a particular application usually do not have this expertise.

Our approach to fast-path optimization takes into account that the implementa-tion of the individual micro-protocols is static for each release of Ensemble, whilethe configuration of a particular application stack is not. This allows us to provideformally verified knowledge about fast-path optimizations of each micro-protocol apriori , i.e. together with the release of Ensemble, and to implement tactics thatuse this knowledge to automatically generate fast-path optimizations of applicationstacks. Formally, we do this by composing optimization theorems as illustratedin Figure 6.

There are two levels of formal optimizations. The first, or static, level (see Sec-tion 4.3) depends solely on the code of the individual micro-protocols and is per-

Page 16: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

16 Christoph Kreitz

formed semi-automatically under the guidance of the programmer who developedthe micro-protocol and a Nuprl expert. At this level we state and prove optimiza-tion theorems about the result of optimizing micro-protocols for four fundamentalcases: down or up going events (sending or receiving messages) for both point-to-point sending and broadcasting. These theorems may take anything from 2 minutesto an entire afternoon to develop and are part of the optimization tool that is madeavailable to the application developer.

The second, or dynamic, level (see Section 4.4) depends on the particular protocolstack configured by an application developer and cannot be provided a priori. Usingcomposition theorems about the effect of applying Ensemble’s composition mech-anism to common combinations of fast-path optimizations, we generate and provetheorems about the result of optimizing the application stack. Header compression,described in Section 4.5 may be integrated at this stage. We then synthesize bypasscode from these theorems and create a module for integrating it into Ensemble

(Section 4.6). This step is completely automated, requires as input only the namesof the micro-protocols that occur in the application stack, and typically obtains itsresult in less than 1/2 minute.

It should be noted that optimization is orthogonal to verification. Our formaltools prove that the resulting code is semantically equal to the original protocolstack but do not make any assumptions about properties of the stack. In the restof this section we will explain the technical details of the optimization tool.

4.3 Optimization of Micro-Protocols

The static optimizations of Ensemble micro-protocols are based on an analysis ofpossible fast-paths, or branches in the protocol code, which essentially describes asimple state-event machine. In principle, each branch could be considered a fast-path whose CCP is composed of the predicates in the corresponding conditionals.However, the common case during communication is usually related to “ordinary”messages being sent or received by the application, and not to handling errors, re-transmissions, group management, etc. Identifying these paths requires insight intothe code, which means that the Ensemble programmer must either annotate thecode or, as currently done, provide the information when initiating the optimizationof the micro-protocol.

The optimization of a protocol layer proceeds by a series of code transformationsthat preserve the semantics of a layer’s code under the assumption of a CCP andare based on the following basic mechanisms:

Function inlining and symbolic evaluation simplifies code in the presence of con-stants or function calls. Logically, this means rewriting the code by unfolding defini-tions and controlled partial evaluation. Both techniques can be expressed in termsof evaluation rules from our programming logic for OCaml (see Section 3.4, whichguarantees their correctness with respect to OCaml’s type-theoretical semantics).

To automate the application of these rules we have developed an evaluationtactic Red, which searches top-down for the first reducible subterm of an OCaml

Page 17: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 17

type header = Variant record type for possible headerstype state = Record type for the layer’s state

let init () (ls,vs) = {Initial state of the layer }let hdlrs s (ls,vs) {up out=up; upnm out=upnm; ...; dnnm out=dnnm}

= ...let up hdlr ev abv hdr = ...and uplm hdlr ev hdr = ...and upnm hdlr ev = ...and dn hdlr ev abv = ...and dnnm hdlr ev = ...in {up in=up hdlr; uplm in=uplm hdlr; ...; dnnm in=dnnm hdlr}

let l args vs = Layer.hdr init hdlrs args vs

Fig. 7. Code structure of Ensemble micro-protocols.

program fragment and reduces it. The search can be restricted by providing asubterm address as additional argument, which allows a user to to leave certainexpressions unchanged while focusing on meaningful reductions.

Directed equality substitution such as the application of distributive laws lead tofurther simplifications of the code. Technically, we apply lemmata from Nuprl’slogical library. Adding a direction to each lemma (e.g., in the case of equality,indicating whether the lemma should be applied from left-to-right or vice versa)guarantees termination of this process.

Context-dependent simplifications help in extracting the relevant code from a micro-protocol. They trace the code path of messages that satisfy the CCPs and isolate thecorresponding code fragments. Technically, we consult a CCP in order to substitutea piece of code by a value and then rewrite the result with the above two mech-anisms. A tactic UseAssumption performs these steps automatically for a givenassumption. Its implementation is straightforward, as Nuprl supports equalityreasoning and the management of hypotheses.

Tailored transformations take advantage of the fact that micro-protocols in En-

semble are coded according to a certain discipline, depicted in Figure 7, whichmakes it possible to create a functional, imperative, or threaded version of Ensem-

ble from a single reference implementation. Each protocol layer l is built from aninitialization function init and an event handler hdlrs, which describes how inputevents affect the state of the protocol and what events will be sent to adjacentprotocols by modifying the event handlers of the surrounding protocol stack. Todo so it performs a case analysis over the direction and type of an input event andtriggers actions accordingly.

This discipline enabled us to write a tactic EvaluateCodeStructure that ap-plies a predetermined series of unfold and evaluation steps, η-reductions, distribu-tive laws, and other “undirected” equalities to isolate the relevant action in theevent handler of a protocol. The tactic performs the tedious initial steps of amicro-protocol optimization automatically and is very efficient, since it does notrequire search.

Page 18: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

18 Christoph Kreitz

Optimizations for each micro-protocol proceed in two phases. In the first phasewe use tactic-based forward reasoning to isolate the fast-path that corresponds tothe CCP and to optimize its code. Optimizations are initiated for four fundamentalcases: down or up going events for both point-to-point sending and broadcasting.Basic CCPs for these cases are created automatically and the Ensemble program-mer may add additional CCPs to describe the common case.

The basic CCP for optimizing Ensemble’s Pt2pt protocol with respect to point-to-point sending, for instance, states that input events have the form DnM(ev, hdr)

where ev is a point-to-point send event (getType ev = ESend). As additional CCPthe Ensemble programmer adds that applications commonly do not send to them-selves (getPeer ev 6= ls.rank). Formally, we generate the following optimizationobject in Nuprl.

OPTIMIZE LAYER Pt2ptFOR EVENT DnM(ev, hdr)AND STATE s pt2ptASSUMING getType ev = ESend ∧ not (getPeer ev = ls.rank)

The Nuprl definition used in this object abbreviates a complex logical formula,which states that we want to show that the event handler of Pt2pt, applied to thestate s pt2pt and input event DnM(ev, hdr), is equivalent to some simpler codeunder the given assumptions.

After initializing the optimization objects we apply the initial fixed series ofsimplifications by calling the tactic EvaluateCodeStructure. As a result, we reacha choice point, i.e. a conditional, in the event handler’s code. We repeatedly callthe tactic UseAssumption to apply context-dependent simplifications for each ofthe conditions in the CCP until the corresponding code fragment has been isolated.All these steps are completely automated and usually the optimization may stopat this point. However, we give the Ensemble programmer an opportunity toinvoke additional simplifications with the tactic Red before committing the resultto Nuprl’s library. Often, about 200-500 lines of code have been reduced to a singleupdate of the protocol’s state and a single output event to be sent to the next layer.

In the second phase, we create an optimization theorem, which proves that, underthe CCP, the bypass code is semantically equal to the protocol from which it wasgenerated. This theorem verifies the correctness of the optimizations and will laterbe used for the optimization of protocol stacks, as illustrated in Section 4.4. Tocreate it, we compose the initial optimization object with its final result into astatement of a formal Nuprl theorem. The optimization of Pt2pt with respect topoint-to-point sending, for instance, leads to the following theorem.

OPTIMIZING LAYER Pt2ptFOR EVENT DnM(ev, hdr)AND STATE s pt2ptASSUMING getType ev = ESend ∧ not (getPeer ev = ls.rank)

YIELDS HANDLERS dn ev (Full (Data (Iq.hi(Arraye.get s pt2pt.sends (getPeer ev))), hdr))

AND UPDATES Iq.add (Arraye.get s pt2pt.sends (getPeer ev))(getIov ev) hdr

Page 19: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 19

Again, we have used a Nuprl definition to make a complex logical formulamore comprehensible to programmers. The formula states that under the givenassumptions applying the event handler of Pt2pt to s pt2pt and DnM(ev, hdr)

leads to the same result as applying the code fragments mentioned as handlers forgenerating output events and as updates for the internal state.

To prove the optimization theorem, we use the trace of the optimization as proofplan that triggers the application of proof tactics, which perform exactly the samesteps on the left hand side of an equation as the rewrite tactics Red, UseAssumption,and EvaluateCodeStructure did on the code of the micro-protocol. As a conse-quence, creating and proving the optimization theorem is completely automatic,even if the original optimization required considerable interaction. It can be trig-gered as background process when the result of an optimization is being committed.

4.4 Stack Optimization

In contrast to individual layers, application protocol stacks cannot be optimizeda priori, as thousands of possible configurations can be generated with the En-

semble toolkit. Since the application developer has little or no knowledge aboutEnsemble’s code, the optimization of an application stack has to be completelyautomatic. We have developed a tool for optimizing arbitrary protocol stacks, whichproceeds as follows (see Figure 6).

Given the names of the micro-protocols in the stack, the system consults theoptimization theorems of these protocols and composes them into optimizationtheorems for the complete stack. This is done separately for the four fundamentalcases – down and up going events for both point-to-point sending and broadcasting– to allow for the creation of independent bypass code fragments.

For composing fast-path optimizations of protocols we apply formal compositiontheorems. These theorems describe abstractly the effect of applying Ensemble’scomposition mechanism to common combinations of fast-paths, such as linear traces(events that pass straight through a layer), bouncing events (events that generatecallback events), and trace splitting (events that cause several events to be emittedfrom a layer) — both for up- and down-going input events.

The formal composition theorem below, for instance, expresses the obvious effectof composing down-going linear fast-paths: if an event passes straight down throughthe upper layer and then through the lower one, then it passes straight through thecomposed layers (Upper ||| Lower) as well. The state update for the combined layeris the combination of the individual updates.

OPTIMIZING LAYER UpperFOR EVENT DnM(ev, hdr) AND STATE s upper

YIELDS HANDLERS dn ev hdr1 AND UPDATES stmt1

∧ OPTIMIZING LAYER LowerFOR EVENT DnM(ev, hdr1) AND STATE s lower

YIELDS EVENTS dn ev hdr2 AND UPDATES stmt2

⇒ OPTIMIZING LAYER Upper ||| LowerFOR EVENT DnM(ev, hdr) AND STATE (s upper, s lower)

YIELDS EVENTS dn ev hdr2 AND UPDATES stmt1; stmt2

Page 20: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

20 Christoph Kreitz

While the statement of such a theorem is almost trivial, its formal proof is quitecomplex, because the code of Ensemble’s composition mechanism uses generalrecursion. By analyzing this code abstractly, and providing the result of the analysisas a formal theorem, we have lifted the optimization process to a higher conceptuallevel: we can now reason about composition as such, instead of having to go into thedetails of the code of each protocol and of the composition mechanism. It also speedsup the optimization process significantly: optimizing composed protocol layers isnow a single reasoning step, while thousands of simplification steps would have tobe applied to the code of the entire stack to achieve the same result.

Like the optimization theorems for individual micro-protocols, all compositiontheorems are provided a priori as part of the optimization tool, as they do notdepend on a particular application stack. As a result, the optimization theoremsfor a protocol stack can be generated and proven automatically. To create thestatement of such a theorem, we compose the statements of the corresponding layeroptimization theorems as described by the statement of the composition theorems.To prove it, we first instantiate the optimization theorems of the layers in the stackwith the actual input event that will enter them. We then apply, step-by-step, theappropriate composition theorems to compose the bypass through the stack.

For optimizing the stack Pt2pt ||| Mnak ||| Bottom with respect to point-to-pointsending, for instance, the layer composition tactic consults the following optimiza-tion theorems.

OPTIMIZING LAYER Pt2ptFOR EVENT DnM(ev, hdr)AND STATE s pt2ptASSUMING getType ev = ESend ∧ not (getPeer ev = ls.rank)

YIELDS HANDLERS dn ev (Full (Data (Iq.hi(Arraye.get s pt2pt.sends (getPeer ev))), hdr))

AND UPDATES Iq.add (Arraye.get s pt2pt.sends (getPeer ev))(getIov ev) hdr

OPTIMIZING LAYER MnakFOR EVENT DnM(ev, hdr)AND STATE s mnakASSUMING getType ev = ESend

YIELDS HANDLERS dn ev (Full nohdr hdr)AND UPDATES ()

OPTIMIZING LAYER BottomFOR EVENT DnM(ev, hdr)AND STATE s bottomASSUMING getType ev = ESend ∧ s bottom.enabled

YIELDS HANDLERS dn ev (Full nohdr hdr)AND UPDATES ()

Since all three optimizations show a linear behavior, we can combine them using thecomposition theorem for linear traces. We use the input event of Pt2pt as input forthe three-layer stack and compose the individual states into a tuple. The resultinghandler code is determined by matching the input and output events of adjacentlayers while the update code is simply composed from the individual updates. Asa result we get the following optimization theorem for the Pt2pt ||| Mnak ||| Bottom.

Page 21: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 21

OPTIMIZING LAYER Pt2pt ||| Mnak ||| BottomFOR EVENT DnM(ev, hdr)AND STATE (s pt2pt, s mnak, s bottom)ASSUMING getType ev = ESend ∧ not (getPeer ev = ls.rank)

∧ s bottom.enabledYIELDS HANDLERS dn ev (Full nohdr (Full nohdr (Full (Data (Iq.hi

(Arraye.get s pt2pt.sends (getPeer ev))), hdr))))AND UPDATES Iq.add (Arraye.get s pt2pt.sends (getPeer ev))

(getIov ev) hdr

The theorem-based approach to fast-path optimization does not only lead to fullyautomated optimization tools but also to a much clearer style of reasoning: theabstraction level of program transformations is raised from low-level expressions ofthe programming language to reasoning about modules as such. This also improvesthe performance of the optimization process, which now requires linear time withrespect to the number of micro-protocols in the stack.

4.5 Header Compression

Optimization theorems do not only describe a fast-path through a protocol stackbut also provide the means for an additional optimization that cannot be achievedby partial evaluation or related techniques. They tell us exactly which headers areadded to a typical data message by the sender’s stack and how the receiver’s stackprocesses these headers in the respective layers. As most of the header fields arenow fixed, we only have to transmit the header fields that may vary. In the stackPt2pt ||| Mnak ||| Bottom, for instance, point-to-point sending creates the header

Full nohdr (Full nohdr (Full (Data(Iq.hi(Arraye.get s pt2pt.sends (getPeer ev)), hdr))))

in which only the emphasized field contains essential information. Transmitting onlythis field will reduce the net load and improve the performance of communication.

For this purpose, we generate code for compressing and expanding headers, andwrap the protocol stack with these two functions. Both functions are generatedautomatically by considering the free variables of the events in the optimizationtheorems. For the stack Pt2pt ||| Mnak ||| Bottom we generate the following functions

let compress hdr = match hdr withFull nohdr (Full nohdr (Full (Data seqno, hdr))) -> OptSend(seqno, hdr)

| Full nohdr (Full (Data (seqno), Full nohdr hdr)) -> OptCast(seqno, hdr)| hdr -> Normal(hdr)

let expand hdr = match hdr withOptSend(seqno, hdr) -> Full nohdr (Full nohdr (Full (Data seqno, hdr)))

| OptCast(seqno, hdr) -> Full nohdr (Full (Data (seqno), Full nohdr hdr))| Normal(hdr) -> hdr

We then optimize the code of the wrapped protocol stack using the same method-ology as above. We have provided generic compression and expansion theorems,which describe the outcome of optimizing a wrapped stack relatively to results ofoptimizing a regular stack, and use them to convert the optimization theorems forthe regular stack into optimization theorems for the wrapped stack. All these stepsare completely automated.

Page 22: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

22 Christoph Kreitz

For point-to-point sending in the stack Pt2pt ||| Mnak ||| Bottom, for instance, com-bining fast-path optimization with compression leads to the following theorem.

OPTIMIZING LAYER Pt2pt ||| Mnak ||| Bottom WRAPPED WITH COMPRESSIONFOR EVENT DnM(ev, hdr)AND STATE (s pt2pt, s mnak, s bottom)ASSUMING getType ev = ESend ∧ not (getPeer ev = ls.rank)

∧ s bottom.enabledYIELDS HANDLERS dn ev (OptSend (Iq.hi

(Arraye.get s pt2pt.sends (getPeer ev))), hdr)AND UPDATES Iq.add (Arraye.get s pt2pt.sends (getPeer ev))

(getIov ev) hdr

It should be noted that integrating compression into the optimization processwill always lead to an improvement in the common case, because the optimizedcode will directly generate or analyze events with compressed headers, instead offirst creating a full header and then compressing it. Only for the non-common casethere will be a slight overhead due to compression.

4.6 Code Generation

All the above optimization steps describe logical operations within the Nuprl sys-tem. In a final step, their results are converted into OCaml code that can becompiled and linked to the rest of the communication system. To generate thiscode, we compose the code fragments from the four optimization theorems intoa single program, which also delays state updates until events have been sent ordelivered, using the CCPs as conditionals that select either one of the fast-paths orthe original stack. The bypass code is then wrapped by code fragments that convertit into a module that can be compiled and linked to Ensemble without furthermodifications.

We have developed a tactic Optimize that combines stack optimization, headercompression, and code generation into a single operation. Given a list of names ofthe micro-protocols in the protocol stack it takes less than 30 seconds to generatethe optimized code and prove it to be equivalent to the original stack.

Experiments in (Liu et al., 1999) have shown that in common applications with 10or more layers the optimized code is significantly more efficient than the original En-

semble application stack. Although the synthesized code has to be integrated intothe functional version of Ensemble, which typically is about 50% slower than theimperative one, it outperforms the best version of Ensemble by a factor of 3 to 5.

5 Verification and Formal Design

In the previous sections we have focused on the foundations for formal reasoningabout the the code of modular systems and on logic-based tools for optimizingtheir performance. We will now briefly address the remaining aspects of a formalinfrastructure for building reliable, high-performance networks, namely verificationand formal design.

Page 23: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 23

5.1 Specification and Correctness

The goal of specification is to give a precise description of a system and to defineand document its features. Specifications can be used to guide the configuration ofapplication systems from modules, to support the design of new modules, and todetermine whether an implementation is correct .

Specifications range from specifying the behavior of a system to specifying itsproperties. Both kinds of specifications are important. Properties describe the sys-tem at the highest level while behavioral specifications describe how to implementthe properties. Behavioral specifications can be either concrete or abstract . Abstractspecifications are non-deterministic descriptions of a system’s global behavior. Con-crete specifications give deterministic descriptions of a system’s components and caneasily be mapped onto executable code.

As an example consider a FIFO network, which is characterized by the propertythat messages are received in the same order in which they were sent . An abstractbehavioral specification would introduce a global queue of events in transit andstate that messages may be appended to the end of the queue and be removed fromits beginning . A concrete behavioral specification would describe a protocol thatattaches sequence numbers to messages and require that incoming messages whosesequence number is too big will be buffered . At the lowest level we find the imple-mentation of this protocol, for instance as Ensemble’s Pt2pt module.

The relationship between the four lev-

AbstractNetworkModel

Scheduling

Refinement Proof

Implementation

Properties

SpecificationAbstract Behavioral

Concrete BehavioralSpecification

els of specification can be pictured as fol-lows. Properties of an abstract specificationcan be derived by proof . A concrete spec-ification is derived from the abstract spec-ification by refinement , which involves de-signing a protocol that implements the ab-stract requirements with respect to someabstract network model. The implementa-tion is linked to the concrete specification by scheduling the order of actions andcoding them in a specific programming language.

Introducing several levels of abstraction makes it feasible to prove propertiesof a system’s implementation or to derive an implementation from properties –establishing a direct link between properties and implementation would be muchharder. Formal verification and design of networked systems in Nuprl thereforerequires a representation of all four level is Nuprl’s type theory.

For the highest level, the abstract system properties, we have developed a formalmodel of communication (Bickford et al., 2001a; Bickford et al., 2001b; Bickfordet al., 2001c) that enables us to reason in Nuprl about properties of global tracesof events such as reliability, confidentiality, message ordering, etc. Abstract andconcrete system specifications are expressed in terms of non-deterministic and de-terministic IO-automata (IOA) (Lynch, 1996), which are abstractions of the state-event machines implicitly used in the descriptions of network protocols. A type-theoretic representation of IO-automata has been developed in (Hickey et al., 1999;

Page 24: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

24 Christoph Kreitz

Bickford & Hickey, 1999). The implementation level of Ensemble is the program-ming language OCaml (Leroy, 2000), whose representation is discussed in Section 3.

A formal verification exploits the above relationship between the four levels. Asin the case of fast-path optimizations, a compositional approach is taken: individualmicro-protocols are verified independently and the verification of protocol stacks isbased on IOA composition, which is proven to preserve all safety properties of thecomponents. A good example of such a proof can be found in (Hickey et al., 1999),which demonstrates the correctness of one of Ensemble’s total ordering protocolsand located a subtle bug in the original implementation.

5.2 Formal Design

Formal methods can have a large impact when being engaged at the earliest stagesof design and implementation. At this stage it is possible to state assumptions andgoals that drive the system design and to use proof environments to clarify thesegoals, to explore ideas, and to detect flaws in the design before it is being coded.

In (Liu et al., 2001; Bickford et al., 2001a; Bickford et al., 2001c) we show howthe Nuprl system has contributed to the design and implementation of a verifiablycorrect adaptive network protocol for Ensemble. The protocol was realized as ahybrid protocol that switches between specialized protocols and formally provencorrect with the Nuprl system. In the process we have developed a characteriza-tion of communication properties that can be preserved by dynamic switching. Wehave introduced the concept of meta-properties to abstractly describe switchableproperties and have shown that six meta-properties are sufficient for protocols towork correctly under a switch. We also have characterized a switch-invariant thatan implementation of the switch has to satisfy to preserve switchable properties.

The verification efforts revealed hidden assumptions that are crucial for the cor-rectness of the implementation and showed limitations for the use of such a genericprotocol that might otherwise have gone unnoticed. This demonstrates that engag-ing proof systems such as Nuprl at the earliest stages of design and implementationadds value to all subsequent stages and creates valuable information needed for themaintenance and evolution of software.

6 Related Work

The CMU Fox project (Biagioni, 1994) uses an extension of Standard ML for buildingprotocol stacks. Its broader goal is to investigate the extent to which high levellanguages like ML are suitable for systems programming.

Integrated Layer Processing (ILP) (Clark & Tennenhouse, 1990) is an approachto reduce the overhead of layered protocols (Abbott & Peterson, 1993). The FilterFusion Compiler (FFC) (Proebsting & Watterson, 1996) implements ILP usingpartial evaluation, but has only been applied to very simple protocols. Furthermore,the code generated by FFC has to be hand-modified to get good performance.

The Esterel compiler (Castelluccia & Dabbous, 1996) is used to convert a proto-col specification into a sequential finite automaton, from which efficient C code is

Page 25: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 25

generated. Esterel was used to specify and implement a large subset of the TCP pro-tocol, but it does not scale easily to arbitrary protocol stacks. Furthermore it cannotformally ensure the correctness of either the optimization or the protocol itself.

In operating system research there is related work on locating and optimizingcommon paths. Synthesis (Massalin, 1992) uses a run-time code generator to opti-mize the most frequently used kernel routines. (Pu et al., 1995) describes work onoptimizing Synthetix kernel functions by reducing the length of common paths.

A variety of formal systems has been used for verifying and synthesizing hardwareand software systems. Model checking (McMillan, 1993; Manna & Pnueli, 1995;Clarke et al., 1999) has been very successful in the verification of hardware andfinite state software systems. Numerous systems (Cleaveland & et. al, 1994; Dill,1996; Holzmann, 1997) have been used in a variety of case studies. But so far theyare restricted to finite state systems and of limited value for reasoning about thecomplete code of real distributed systems. Deductive model checking (Finkbeineret al., 1998; Sipma et al., 1999) combines model checking with automated deductionfor the verification of infinite-state and real-time systems. Although this approachis very promising, it is not yet applicable to distributed systems.

There have been numerous applications of Isabelle (Paulson, 1990; Paulson, 1998;Paulson, 1999), PVS (Lincoln & Rushby, 1993; Owre et al., 1996; Rushby, 1994;Rushby, 1997; Rushby, 1999), and ACL2 (Kaufmann et al., 2000; ACL2, n.d.) toverifying abstract communication protocols. Few projects, however, deal with theactual code of real-world systems.

The KIDS system (Smith, 1991) and its successor SPECWARE (Srinivas & Jullig,1995) support the modular construction of formal specifications and their refine-ment into executable code. These systems have been very successful in practical ap-plications (Smith & Parra, 1993; Gomes et al., 1996; Blaine et al., 1998; Westfold &Smith, 2001) but are currently limited to the development of sequential algorithms.

7 Conclusion

We have described an embedding of the Ensemble group communication toolkit tothe Nuprl proof development system that is based on a type-theoretical semanticsof Ensemble’s implementation language OCaml. The formal link between Ensem-

ble and Nuprl provides an infrastructure for the application of logical inferencetechniques to the actual code of a modular, real-world system. Using this infrastruc-ture we have shown how to build logic-based fast-path optimization tools that cansignificantly improve the performance of the already optimized Ensemble systemin concrete applications and are guaranteed not to introduce any errors. Our resultsshow that logical methods for program synthesis, verification, and optimization canbe made to scale effectively to large software systems.

This article extends preliminary work reported in (Kreitz, 1997; Kreitz et al.,1998; Kreitz, 1999), which now has matured into pushbutton technology and isbased on an advanced semantical model for OCaml, which allows for a type-theoretical representation of a larger fragment of the programming language.

Although we chose a limited application domain – all Ensemble configurations

Page 26: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

26 Christoph Kreitz

are stacks of micro-protocols, which can be considered state-event machines – webelieve that our approach could generalize and scale to more general configurationand component types. We believe that the following ingredients were key to thesuccess of our approach:

1. Using small and simple system components, which are easier to reason about.2. Using a well-defined configuration operation on components.3. Using a mostly functional implementation of components in a language with

a formal semantics, which allows formal manipulations.4. Using and continuously expanding a tactical proof system for a rich specifi-

cation language such as Nuprl, which makes verification and optimizationtechniques scale.

5. Using several layers of formal abstraction, libraries of verified formal knowl-edge, and compositional reasoning, which makes formal techniques indepen-dent from a particular application domain.

6. Using a collaboration between systems and theorem proving groups, as thejoint expertise is required for making formal methods apply to real systems.

We believe that it may be possible to use our approach in other complex sys-tems such as file systems, atomic transaction protocols, optimizing compilers, andperhaps eventually an entire operating system kernel, as the formal techniquesdescribed here can be applied to other modular systems whose components andcomposition mechanisms can be described semantically.

We also hope to elaborate our optimization technique into one that would allowus to detect common combinations at run-time, and generate the optimized codedynamically, using layer optimization theorems for all possible bypass paths. We canthen make use of Ensemble’s support for dynamically loading layers and switchingprotocol stacks on the fly (van Renesse et al., 1999).

We also intend to extend our work on formalizing the semantics of programminglanguages. Our methods would have a wider impact if we could express the typetheoretical semantics of OCaml’s classes and object and also target other back-endlanguages such as Java, using preliminary work in (Attalli et al., 1998; Naumov,1998). This requires, however, expanding our logical foundations, as type theoriesdo not yet offer sufficient support for objects, methods, classes, and inheritance.

Finally we will continue our research on the verification and formal design ofdistributed systems and also plan to refine and extend our formal models to includereasoning about embedded systems with bandwidth and resource limitations, andtime constraints.

Acknowledgements

The work reported in this paper would not have been possible without the help andinspiration of colleagues – Mark Bickford, Ken Birman, Robert Constable, RichardEaton, Mark Hayden, Jason Hickey, Xiaoming Liu, Lori Lorigo, and Robbert VanRenesse – from both the systems and theorem proving groups at Cornell.

This work is supported in part by ARPA/RADC grants F30602-98-2-0198 andF30602-99-1-0532, and ARPA/ONR grant N00014-01-0765.

Page 27: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 27

References

Aagaard, M., & Leeser, M. (1993). Verifying a logic synthesis tool in Nuprl. Pages 72–83 of: Bochmann, G., & Probst, D. (eds), Workshop on computer-aided verification.Lecture Notes in Computer Science, vol. 663. Springer Verlag. Technical Report TR93-1335.

Abbott, M., & Peterson, L. (1993). Increasing network throughput by integrating protocollayers. Ieee/acm transactions on networking, 1(5), 600–610.

ACL2. ACL2 home page. http://www.cs.utexas.edu/users/moore/acl2.

ALFA. ALFA home page. http://www.cs.chalmers.se/ hallgren/Alfa.

Allen, Stuart, Constable, Robert, Eaton, Richard, Kreitz, Christoph, & Lorigo, Lori.(2000). The Nuprl open logical environment. Pages 170–176 of: McAllester, D. (ed),17th conference on automated deduction. Lecture Notes in Artificial Intelligence, vol.1831. Springer Verlag.

Altenkirch, Thorsten, Gaspes, Veronica, Nordstrom, Bengt, & von Sydow, Bjorn. (1994).A user’s guide to ALF. University of Goteborg.

Andrews, Peter B., Bishop, Matthew, Issar, Sunil, Nesmith, Dan, Pfenning, Frank, & Xi,Hongwei. (1996). TPS: A theorem proving system for classical type theory. Journal ofautomated reasoning, 16(3), 321–353.

Attalli, Isabelle, Caromel, Denise, & Russo, Marjorie. (1998). A formal executable seman-tics for Java. Oopsla’98 workshop on the formal underpinnings of java.

Biagioni, E. (1994). A structured TCP in Standard ML. Pages 36–45 of: Acm sigcommconference.

Bickford, Mark, & Hickey, Jason. (1999). Predicate transformers for infinite-state au-tomata in NuPRL type theory. Irish formal methods workshop.

Bickford, Mark, Kreitz, Christoph, van Renesse, Robbert, & Constable, Robert. (2001a).An experiment in formal design using meta-properties. Pages 100–107 of: Lala, J.,Maughan, D., McCollum, C., & Witten, B. (eds), Darpa information survivability con-ference and exposition ii (discex 2001), vol. II. IEEE Computer Society Press.

Bickford, Mark, Kreitz, Christoph, & van Renesse, Robbert. (2001b). Formally verifyinghybrid protocols with the nuprl logical programming environment. Tech. rept. CornellCS:2001-1839. Cornell University. Department of Computer Science.

Bickford, Mark, Kreitz, Christoph, van Renesse, Robbert, & Liu, Xiaoming. (2001c). Prov-ing hybrid protocols correct. Pages 105–120 of: Boulton, Richard, & Jackson, Paul (eds),14th international conference on theorem proving in higher order logics. Lecture Notesin Computer Science, vol. 2152. Springer Verlag.

Birman, Ken, Constable, Robert, Hayden, Mark, Hickey, Jason, Kreitz, Christoph, vanRenesse, Robbert, Rodeh, Ohad, & Vogels, Werner. (2000). The Horus and Ensem-ble projects: Accomplishments and limitations. Pages 149–160 of: Darpa informationsurvivability conference and exposition (discex 2000). IEEE Computer Society Press.

Birman, K.P., & van Renesse, Robbert. (1994). Reliable Distributed Computing with theIsis Toolkit. IEEE Computer Society Press.

Blaine, Lee, Gilham, Li-Mei, Liu, Junbo, Smith, Douglas R., & Westfold, Stephen. (1998).Planware – domain-specific synthesis of high-performance schedulers. Pages 270–280 of:Thirteenth automated software engineering conference. IEEE Computer Society Press.

Castelluccia, C., & Dabbous, W. (1996). Generating efficient protocol code from an ab-stract specification. Pages 60–72 of: Acm sigcomm conference.

Clark, D., & Tennenhouse, D. (1990). Architectural consideration for a new generation ofprotocols. Pages 200–208 of: Acm sigcomm conference.

Clarke, E. M., Grumberg, O., & Peled, D. (1999). Model checking. MIT Press.

Page 28: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

28 Christoph Kreitz

Cleaveland, R., & et. al. (1994). The concurrency factory – practical tools for the specifica-tion, simulation, verification, and implementation of concurrent systems. Pages 75–90of: Blelloch, G.E., Chandy, K.M., & Jagannathan, S. (eds), Specification of parallelalgorithms.

Constable, Robert L. (1998). Types in logic, mathematics, and programming. Chap. X,pages 684–786 of: Buss, S. R. (ed), Handbook of proof theory. Elsevier Science PublishersB.V.

Constable, Robert L., & Hickey, Jason. (2000). Nuprl’s Class Theory and its Applications.Pages 91–116 of: Bauer, Friedrich L., & Steinbrueggen, Ralf (eds), Foundations of securecomputation. NATO ASI Series, Series F: Computer & System Sciences. IOS Press.

Constable, Robert L., Allen, Stuart F., Bromley, H. Mark, Cleaveland, W. Rance, Cremer,J. F., Harper, Robert W., Howe, Douglas J., Knoblock, Todd B., Mendler, Nax Paul,Panangaden, Prakash, Sasaki, Jim T., & Smith, Scott F. (1986). Implementing mathe-matics with the NuPRL proof development system. Prentice Hall.

Coq. Coq home page. http://pauillac.inria.fr/coq/coq-eng.html.

de Rauglaudre, Daniel. (2000). Camlp4 version 3.00. Institut National de Recherche enInformatique et en Automatique.

Dill, David. (1996). The Murphi verification system. Pages 390–393 of: Alur, R., &Henzinger, T. (eds), Computer aided verification (cav’96). Lecture Notes in ComputerScience.

Dowek, G., & et. al. (1991). The Coq proof assistant user’s guide. Institut National deRecherche en Informatique et en Automatique. Report RR 134.

Engler, D., & Kaashoek, M. (1996). DPF: Fast, flexible message demultiplexing usingdynamic code generation. Pages 53–59 of: Acm sigcomm conference.

Finkbeiner, B., Manna, Z., & Sipma, H. (1998). Deductive verification of modular systems.Pages 239–275 of: Compositionality: The significant difference (compos’97). LectureNotes in Computer Science, vol. 1536. Springer Verlag.

Gomes, Carla P., Smith, Douglas R., & Westfold, Stephen J. (1996). Synthesis of sched-ulers for planned shutdowns of power plants. Pages 12–20 of: Eleventh knowledge-basedsoftware engineering conference. IEEE Computer Society Press.

Gordon, Michael, & Melham, T. (1993). Introduction to hol: a theorem proving environ-ment for higher-order logic. Cambridge University Press.

Hayden, Mark. (1998). The ensemble system. Ph.D. thesis, Cornell University. Departmentof Computer Science. Technical Report TR98-1662.

Hickey, J., Lynch, N., & van Renesse, R. (1999). Specifications and proofs for Ensemblelayers. Pages 119–133 of: Cleaveland, R. (ed), 5th international conference on tools andalgorithms for the construction and analysis of systems. Lecture Notes in ComputerScience, vol. 1579. Springer Verlag.

Hickey, Jason. (1996). Formal objects in type theory using very dependent types. Foun-dations of object oriented languages 3. Williams College.

Hickey, Jason. (2001). The MetaPRL logical programming environment. Ph.D. thesis,Cornell University. Department of Computer Science.

Hickey, Jason, & Nogin, Aleksey. (2000). Fast tactic-based theorem proving. Pages 252–266 of: Harrison, J., & Aagaard, M. (eds), Theorem proving in higher order logics: 13thinternational conference, tphols 2000. Lecture Notes in Computer Science, vol. 1869.Springer Verlag.

HOL. HOL home page. http://www.cl.cam.ac.uk/Research/HVG/HOL.

Holzmann, G. J. (1997). The model checker SPIN. Ieee transactions on software engi-neering, 23(5), 279–295.

Page 29: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 29

Howe, D.J. (1996). Importing mathematics from HOL into NuPRL. Pages 267–282 of:von Wright, J., Grundy, J., & Harrison, J. (eds), Theorem proving in higher order logics.Lecture Notes in Computer Science, vol. 1125. Springer Verlag.

Howe, Douglas J. (1987). The computational behaviour of Girard’s paradox. Pages 205–214 of: Gries, David (ed), Lics-87 – second annual symposium on logic in computerscience, ithaca, new york, usa, june 1987. IEEE Computer Society Press.

Isabelle. Isabelle home page. http://www.cl.cam.ac.uk/Research/HVG/Isabelle.

Jackson, Paul B. (1994). Exploring abstract algebra in constructive type theory. Bundy,Alan (ed), 12th conference on automated deduction. Lecture Notes in Artificial Intelli-gence, vol. 814. Springer Verlag.

Kaufmann, Matt, Manolios, Panagiotis, & Moore, J Strother. (2000). Computer-aidedreasoning: ACL2 case studies. Kluwer Academic Publishers.

Kopylov, Alexei. (2000). Dependent intersection: A new way of defining records in typetheory. Tech. rept. TR 2000-1809. Cornell University. Department of Computer Science.

Kreitz, Christoph. 1997 (June). Formal reasoning about communication systems I: Em-bedding ML into type theory. Tech. rept. TR97-1637. Cornell University. Department ofComputer Science.

Kreitz, Christoph. (1999). Automated fast-track reconfiguration of group communicationsystems. Pages 104–118 of: Cleaveland, R. (ed), 5th international conference on toolsand algorithms for the construction and analysis of systems. Lecture Notes in ComputerScience, no. 1579. Springer Verlag.

Kreitz, Christoph, Hayden, Mark, & Hickey, Jason. (1998). A proof environment forthe development of group communication systems. Pages 317–331 of: Kirchner, C., &Kirchner, H. (eds), 15th conference on automated deduction. Lecture Notes in ArtificialIntelligence, vol. 1421. Springer Verlag.

Leino, K. Rustan M. (2000). Extended static checking: A ten-year perspective. Pages 157–175 of: Wilhelm, R. (ed), Informatics: 10 years back, 10 years ahead. Lecture Notes inComputer Science, vol. 2000. Springer Verlag.

Leroy, Xavier. (2000). The Objective Caml system release 3.00. Institut National deRecherche en Informatique et en Automatique.

Lincoln, Patrick, & Rushby, John. (1993). A formally verified algorithm for interactiveconsistency under a hybrid fault model. Pages 402–411 of: 23rd fault-tolerant computingsymposium.

Liu, Xiaoming, Kreitz, Christoph, van Renesse, Robbert, Hickey, Jason, Hayden, Mark,Birman, Kenneth, & Constable, Robert. (1999). Building reliable, high-performancecommunication systems from components. Pages 80–92 of: 17th acm symposium onoperating systems principles (sosp’99). Operating Systems Review, vol. 34, no. 5.

Liu, Xiaoming, van Renesse, Robbert, Bickford, Mark, Kreitz, Christoph, & Constable,Robert. (2001). Protocol switching: Exploiting meta-properties. Pages 37–42 of: Ro-drigues, Luis, & Raynal, Michel (eds), International workshop on applied reliable groupcommunication (wargc 2001). IEEE Computer Society Press.

Lynch, Nancy. (1996). Distributed algorithms. Morgan Kaufmann.

Manna, Z., & Pnueli, A. (1995). Temporal verification of reactive systems. Springer Verlag.

Martin-Lof, Per. (1984). Intuitionistic Type Theory. Studies in Proof Theory LectureNotes, vol. 1. Napoli: Bibliopolis.

Massalin, H. (1992). Synthesis: An efficient implementation of fundamental operatingsystem services. Ph.D. thesis, Computer Science Department, Columbia University.

McMillan, K. L. (1993). Symbolic model checking. Kluwer Academic Publishers.

MetaPRL. Metaprl home page. http://metaprl.org.

Page 30: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

30 Christoph Kreitz

Murthy, Chetan R., & Russell, James R. (1990). A constructive proof of Higman’s lemma.Pages 257–267 of: Mitchell, John C. (ed), Lics-90 – fifth annual symposium on logic incomputer science, 1990, philadelphia, pa, june 1990. IEEE Computer Society Press.

Naumov, Pavel. (1998). Formalizing reference types in nuprl. Ph.D. thesis, Cornell Uni-versity. Department of Computer Science.

NuPRL. Nuprl home page. http://www.cs.cornell.edu/Info/Projects/NuPrl.

Owre, S., Rajan, S., Rushby, J. M., Shankar, N., & Srivas, M. K. (1996). PVS: Combiningspecification, proof checking and model checking. Pages 411–414 of: Alur, Rajeev, &Henzinger, Thomas A. (eds), Computer-aided verification. Lecture Notes in ComputerScience, no. 1102.

Paulson, L. (1999). Inductive analysis of the internet protocol TLS. Acm transactions oncomputer and system security, 2(3), 332–351.

Paulson, Lawrence C. (1990). Isabelle: The next 700 theorem provers. Pages 361–386 of:Odifreddi, Piergiorgio (ed), Logic and computer science. Academic Press.

Paulson, Lawrence C. (1998). The inductive approach to verifying cryptographic protocols.Journal of computer security, 6, 85–128.

Pfenning, Frank, & Schurmann, Carsten. (1999). Twelf—a meta-logical framework fordeductive systems. Pages 202–206 of: Ganzinger, H. (ed), 16th conference on automateddeduction. Lecture Notes in Artificial Intelligence, vol. 1632.

Pollack, Robert. (1994). The theory of LEGO – a proof checker for the extendend calculusof constructions. Ph.D. thesis, University of Edinburgh.

Proebsting, T., & Watterson, S. (1996). Filter fusion. Pages 119–130 of: 23th acm sym-posium on principles of programming languages.

Pu, C., Autrey, T., Black, A., Consel, C., Cowan, C., Inouye, J., Kethana, L., Walpole,J., & Zhang, K. (1995). Optimistic incremental specialization. Pages 314–321 of: 15thacm symposium on operating systems principles.

PVS. PVS home page. http://pvs.csl.sri.com.

Rushby, John. (1994). A formally verified algorithm for clock synchronization under ahybrid fault model. Pages 304–313 of: 13th acm symposium on principles of distributedcomputing.

Rushby, John. (1997). Systematic formal verification for fault-tolerant time-triggered algo-rithms. Pages 191–210 of: Meadows, Catherine, & Sanders, William (eds), Dependablecomputing for critical applications: 6. IEEE Computer Society Press.

Rushby, John. (1999). Systematic formal verification for fault-tolerant time-triggered al-gorithms. Ieee transactions on software engineering, 25(5), 651–660.

Schmitt, Stephan, Lorigo, Lori, Kreitz, Christoph, & Nogin, Alexey. (2001). JProver:Integrating connection-based theorem proving into interactive proof assistants. Pages421–426 of: Gore, R., Leitsch, A., & Nipkow, T. (eds), International joint conferenceon automated reasoning. Lecture Notes in Artificial Intelligence, vol. 2083. SpringerVerlag.

Schneider, Fred B., Morrisett, Greg, & Harper, Robert. (2000). A language-based approachto security. Pages 86–101 of: Wilhelm, R. (ed), Informatics: 10 years back, 10 yearsahead. Lecture Notes in Computer Science, vol. 2000. Springer Verlag.

Sipma, H., Uribe, T., & Manna, Z. (1999). Deductive model checking. Formal methodsin system design, 15, 49–74.

Smith, Douglas R. (1991). KIDS — a knowledge-based software development system.Pages 483–514 of: Lowry, Michael R., & McCartney, Robert D. (eds), Automating soft-ware design. Menlo Park, CA: AAAI Press / The MIT Press.

Smith, Douglas R., & Parra, Eduardo A. (1993). Transformational approach to transporta-

Page 31: Building Reliable, High-Performance Networks with …...Building Reliable, High-Performance Networks 3 2 Preliminaries 2.1 Nuprl The Nuprl proof development system (Constable et al.,

Building Reliable, High-Performance Networks 31

tion scheduling. Pages 60–68 of: 8th knowledge-based software engineering conference,chicago, il, september 1993.

Srinivas, Y. V., & Jullig, Richard. (1995). SPECWARE: Formal Support for composingsoftware. International conference on the mathematics of program construction.

TPS. TPS home page. http://gtps.math.cmu.edu/tps.html.

van Renesse, R., Birman, K., & Maffeis, S. (1996). Horus: A flexible group communicationsystem. Communications of the association for computing machinery, 39(4), 76–83.

van Renesse, Robbert, Birman, Ken, Hayden, Mark, Vaysburg, Alexey, & Karr, David.(1999). Building adaptive systems using Ensemble. Software: Practice and experience,28(9), 963–979.

Westfold, S. J., & Smith, D. R. (2001). Synthesis of efficient constraint satisfaction pro-grams. Knowledge engineering reviews.