mechanised metatheory for the sail isa specification languagenk480/sail-isabelle.pdf · pl’18,...

13
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 Mechanised Metatheory for the Sail ISA Specification Language Mark Wassell, Alasdair Armstrong, Neel Krishnaswami, Peter Sewell University of Cambridge, UK {mpew2,aa2019,nk480,pes20}@cl.cam.ac.uk Abstract Sail is a language for rigorous specification of instruction set architectures (ISAs); it has been used to model various production and research architectures, including ARMv8-A, RISC-V, and CHERI-MIPS, sufficiently completely to boot multiple operating systems. Intended to be engineer friendly, Sail is an imperative first-order language with a light-weight dependent type system; it can generate OCaml and C em- ulators and Isabelle, HOL4, and Coq definitions. A recent substantial redesign of the Sail type system has been done in conjunction with the development of a core calculus, Mini- Sail, along with paper proofs of type safety. This paper presents further work to mechanise MiniSail in the Isabelle theorem prover, making use of the Nomi- nal2 Isabelle library to address binding structures and alpha- equivalence; it includes the definition of the MiniSail type system and operational semantics and proofs of preservation and progress. We discuss how the mechanisation and paper formalisations relate to each other, including the benefits and pitfalls of each, and comment on how these have influenced and been influenced by the Sail implementation redesign. We also comment on whether the use of Nominal Isabelle allows us to write proofs of programming language safety that are human readable as well as being machine verified. This mechanisation should in future provide a platform for the mechanical generation of a verified implementation of a type checker and evaluator for the language. Keywords Language Verification, Isabelle/HOL, CPU Spec- ification 1 Introduction CPU architectures such as ARM, IBM POWER, MIPS and x86 are a key part of our computing infrastructure and so it is vital that they are engineered without defects or vul- nerabilities. Commonly, an instruction set architecture (ISA) is specified using a mix of English prose and imperative pseudo-code. These specifications are complex and difficult to reason with and so a more rigorous approach is warranted. Sail [2] has been developed as a language for modelling in- struction set architectures. It supports exporting the model to theorem provers, for architecture property reasoning, and to C, to provide an executable of the model. The latter is PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model so that it boots a variety of operating systems. We introduce the Sail language in Section 2. Sail has a novel light-weight dependent type system and uses the Z3 [6] SMT solver to assist with type checking. This type checker has had to be carefully engineered to ensure the decidability of type checking. To avoid the reliance on this engineering, our prior work work [2] introduced MiniSail, a kernel language for Sail and presented a paper formalisation of MiniSail that provided a number of guarantees about the Sail type system including decidability. Decidability of type checking arises from the following: First, the typing rules are syntax directed. This ensures that there is a deterministic choice when deciding which rule to use to type check a particular term. Second, we only send to the SMT solver well-sorted problems in the quantifier-free linear arithmetic logic fragment. The problems are guaran- teed to be in this logic fragment through a combination of limits on the syntax (e.g. we don’t have quantifiers in the constraint syntax) and sort checking. We review the choice of syntax, the type system and the operational semantics of MiniSail in Section 3. This paper builds on the paper formalisation and describes a mechanisation of MiniSail in the Isabelle theorem prover [12]. This mechanisation provides a stronger guarantee than the paper proofs and gives a platform for the building of a validated type checker and executable for MiniSail making use of Isabelle’s code generation features. The mechanisation has also uncovered a shortcoming in the paper proofs around the area of sort checking (that we detail later). We make use of Isar, the structured proof language of Is- abelle, which allows us to write proofs that resemble how paper proofs look. To address binding, α -equivalence and capture avoiding substitution, we make use of Nominal logic as provided by the Nominal2 Isabelle [4] package. The gram- mar of MiniSail is specified using Isabelle datatypes and the typing and reduction semantics by inductive predicates. We prove the type safety property for MiniSail in the usual way using progress and preservation lemmas. Progress tells us that a statement is either a value or can be reduced to another statement, and preservation tells us that types are preserved across reduction. The proofs of preservation make exten- sive use of substitution lemmas, which form the core of the mechanisation. We present the architecture of the Isabelle mechanisation in Section 4 1

Upload: others

Post on 14-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

Mechanised Metatheory for the Sail ISA SpecificationLanguage

Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter SewellUniversity of Cambridge UK

mpew2aa2019nk480pes20clcamacuk

AbstractSail is a language for rigorous specification of instructionset architectures (ISAs) it has been used to model variousproduction and research architectures including ARMv8-ARISC-V and CHERI-MIPS sufficiently completely to bootmultiple operating systems Intended to be engineer friendlySail is an imperative first-order language with a light-weightdependent type system it can generate OCaml and C em-ulators and Isabelle HOL4 and Coq definitions A recentsubstantial redesign of the Sail type system has been done inconjunction with the development of a core calculus Mini-Sail along with paper proofs of type safetyThis paper presents further work to mechanise MiniSail

in the Isabelle theorem prover making use of the Nomi-nal2 Isabelle library to address binding structures and alpha-equivalence it includes the definition of the MiniSail typesystem and operational semantics and proofs of preservationand progress We discuss how the mechanisation and paperformalisations relate to each other including the benefits andpitfalls of each and comment on how these have influencedand been influenced by the Sail implementation redesignWe also comment on whether the use of Nominal Isabelleallows us to write proofs of programming language safetythat are human readable as well as being machine verifiedThis mechanisation should in future provide a platform forthe mechanical generation of a verified implementation of atype checker and evaluator for the language

Keywords Language Verification IsabelleHOL CPU Spec-ification

1 IntroductionCPU architectures such as ARM IBM POWER MIPS andx86 are a key part of our computing infrastructure and soit is vital that they are engineered without defects or vul-nerabilities Commonly an instruction set architecture (ISA)is specified using a mix of English prose and imperativepseudo-code These specifications are complex and difficultto reason with and so a more rigorous approach is warrantedSail [2] has been developed as a language for modelling in-struction set architectures It supports exporting the modelto theorem provers for architecture property reasoning andto C to provide an executable of the model The latter is

PLrsquo18 January 01ndash03 2018 New York NY USA2018

comprehensive enough to run the model so that it boots avariety of operating systems We introduce the Sail languagein Section 2Sail has a novel light-weight dependent type system and

uses the Z3 [6] SMT solver to assist with type checking Thistype checker has had to be carefully engineered to ensure thedecidability of type checking To avoid the reliance on thisengineering our prior work work [2] introduced MiniSail akernel language for Sail and presented a paper formalisationof MiniSail that provided a number of guarantees about theSail type system including decidabilityDecidability of type checking arises from the following

First the typing rules are syntax directed This ensures thatthere is a deterministic choice when deciding which rule touse to type check a particular term Second we only send tothe SMT solver well-sorted problems in the quantifier-freelinear arithmetic logic fragment The problems are guaran-teed to be in this logic fragment through a combination oflimits on the syntax (eg we donrsquot have quantifiers in theconstraint syntax) and sort checkingWe review the choice of syntax the type system and the

operational semantics of MiniSail in Section 3This paper builds on the paper formalisation and describes

a mechanisation of MiniSail in the Isabelle theorem prover[12] This mechanisation provides a stronger guarantee thanthe paper proofs and gives a platform for the building of avalidated type checker and executable for MiniSail makinguse of Isabellersquos code generation features The mechanisationhas also uncovered a shortcoming in the paper proofs aroundthe area of sort checking (that we detail later)We make use of Isar the structured proof language of Is-

abelle which allows us to write proofs that resemble howpaper proofs look To address binding α-equivalence andcapture avoiding substitution we make use of Nominal logicas provided by the Nominal2 Isabelle [4] package The gram-mar of MiniSail is specified using Isabelle datatypes and thetyping and reduction semantics by inductive predicates Weprove the type safety property for MiniSail in the usual wayusing progress and preservation lemmas Progress tells usthat a statement is either a value or can be reduced to anotherstatement and preservation tells us that types are preservedacross reduction The proofs of preservation make exten-sive use of substitution lemmas which form the core of themechanisation We present the architecture of the Isabellemechanisation in Section 4

1

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

The contributions of this work are as followsbull We provide further guarantees of the soundness of theSail ISA modelling language by mechanising it in atheorem proverbull We show how a language with a light-weight depen-dent type system that uses an SMT solver as part oftyping checking can be formalised using Nominal Is-abellebull We argue that with some up front cost the Isar struc-tured proof language and Nominal Isabelle allow us towrite mechanised proofs that resemble proofs found inpaper formalisation and are understandable by thosewithout expert knowledge of theorem proversbull We show the benefits of the three way co-developmentof this mechanisation with the development of the pa-per proofs and the full Sail implementation presentedin [2]

2 SailWe give a flavour of the Sail language The Sail specificationof an instruction set describes how bit strings map to in-structions in the architecture and how these instructions areexecuted The execution of an instruction manipulates theinternal state of the CPU (such as the registers) and interactswith the memory accessible from the CPU

As expected a large part of ISA specifications involve themanipulation of bit vectors For example appending two bitvectors extending a bit vector with zeros or pattern match-ing on bit vectors to decode an instruction The instructionsets modelled go far beyond small research architectures andinclude commercial CPUs architectures that describe hun-dreds or thousands of instructions Keeping track of vectorlengths across all of the instructions is something that wouldassist the specification developer To support this Sail pro-vides a light-weight dependent types system which allowsthe specification of constraints on the lengths of bit vectorsand the range of integers

An example of a type that can be specified is range(032)which is the type of integers that fall between 0 and 32 Theseconstraints extend to function signatures with type variablesthat are universally quantified For example the type of afunction that appends two bit vectors isval append = forall (nInt)(mInt)

(bits(n)bits(m))-gtbits(n+m)

The n and m are integer type variables that specify thelength of the bit vectors and are used to constrain the lengthof the bit vector returned by the function Sail also makesuse of these refinement types in its definition of arithmeticoperators such as additionval add_atom = forall n m

(atom(n)atom(m)) -gt atom(n+m)

The type atom(n) is the type of integers that are equal ton and this is how program level values are lifted into type

level values and fed into the result type of this function Sailalso provides record types (labelled product types) and uniontypes (algebraic datatypes) The former is used to model theinternal states of the CPU and the latter is used to model theAST of the instruction set

Figure 1 shows an example Sail function The functiondouble_bits takes a bit vector of length m and produces abit vector of double the length with the bit pattern of theinput duplicated The signature of double_bits illustratesthe dependent type nature of Sail the length of the resultingvector is dependent on the length of the supplied vectorFurthermore the signature includes the constraint that thelength is at least 1 In the body of the function the mutablevariable ys is used to build the result vector It is initialisedto a vector of zeros with length equal to the result vectorlength Then in the foreach loop ys is first shifted left by thelength of xs and then these new bits are set to the bits fromxs The final line of the function returns ys Type checkingmust check at each assignment that the value being assignedto ys is a subtype of the original type of ys that was set whenit is first assigned In addition the type of the return valueys is checked against the return type of the function

$include ltpreludesailgt

infix 7 ltlt

val operator ltlt = shift_bits_left forall n m

(bits(n) atom(m)) -gt bits(n)

val or_vec forall n

(bits(n) bits(n)) -gt bits(n)

overload operator | = or_bool or_vec

val length forall n bits(n) -gt atom(n)

val zero_extend forall n m m gt= n

(bits(n) atom(m)) -gt bits(m)

val double_bits forall m m gt= 1

(bits(m)) -gt bits(m + m)

function double_bits(xs) =

let m = length(xs)

ys bits(m+m) = zeros(m+m)

foreach (i from 1 to 2)

ys = ys ltlt length(xs)

ys = ys | zero_extend(xs length(ys))

ys

Figure 1 Example Sail Function

We can not adopt a full dependent type system such asthat found in Agda[13] as we want the language to be un-derstandable by the hardware engineers who are used toclassical imperative languages In addition we want the typechecking to be decidable and this motivates a light weight

2

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

dependent type system based on liquid types [17] and us-ing an SMT solver to resolve constraints Even so liquidtypes are subtle and are easy to get wrong [21] This moti-vates a formalisation of the language and proofs that providesoundness guaranteesThe Sail language includes the usual constructs for an

imperative language if statements mutable variables as-signment and while loops Also included is pattern matchingin match statements and function arguments Pattern match-ing supports complex matching on bit patterns and this isvery useful when decoding bit patterns into instructions

As well as making it engineer friendly the imperativenature of language makes it possible to export to to C andOCaml to provide efficient execution of the model In thecase of the C export this performs sufficiently well thatan OS can be booted on this lsquovirtual CPUrsquo To support theformal validation of the model the specification can alsobe exported as theorem prover code for the Isabelle HOLand Coq theorem provers Instruction set models have beendeveloped for commercial CPUs such as ARM [1] as well asresearch architectures such as CHERI [22 23]A further motivation for a formal specification of Sail is

the following Instruction set architectures for commercialCPUs are long lived and evolve over decades and so if Sail isto be used to write these architectural models Sail shoulditself be stable over this time This stability could be achievedby freezing the language and its implementation but thiswouldmean that we can not take advantage of improvementsto SMT solvers or the implementation language A clearspecification for Sail that describes the language and typesystem would mitigate the risks associated with changes tothe implementation technology

To support the re-implementation of Sail we decided thatrather than attempt a specification of the full language wewould formalise a core-calculus of the language that containsthe key interesting features of Sail such as dependent-typesand imperative constructs This led us to MiniSail which isdescribed in the next section

3 MiniSail31 Kernel Language DesignBefore describing the syntax and semantics of MiniSail wemotivate our choice of Sail features that have been includedin MiniSail We have shown above that Sail has a depen-dent type system makes use of the Z3 SMT solver for typechecking and has some imperative language features suchas mutable variablesWe want the problems that we give to Z3 to be within a

decidable logic fragment for Z3 We choose as this fragmentquantifier-free linear arithmetic with algebraic datatypeswhich is decidable in Z3 The problems we pass to the solverare generated from the constraints within the types that pass

through the type checker Types arise from typing annota-tions in the MiniSail program or are synthesised by the typechecker We choose as values and expressions a representa-tive sample from Sail that will give us a variety of typingrules that generate types

Constraints in types can contain immutable variables butnot mutable variables so we need to ensure that synthesisedtypes will not contain mutable variables Also we need to il-lustrate that type checking for mutable variable binding andusage is sound as well so we include some of the imperativefeatures of Sail into MiniSail

Liquid typing [17] that is used in Sail provides flow typingso we choose to include if andmatch statements in MiniSaillanguage and their associated flow typing rules

32 SyntaxMiniSail is intended to be a calculus for Sail rather than a pro-gramming language and has been designed for mathematicalrather than programming convenience To achieve this thesyntax of MiniSail is based on let-normal form and a strat-ification of what are expressions in the Sail language intovalues expressions and statements in MiniSail This leadsto a corresponding stratification of the typing judgementsand makes reasoning about the type system clearer Thegrammar for values expressions and statements is shown inFigure 2 along with other constructs in the language

In the grammarwemake use of the followingmeta-symbols

bull n - Numeric literalsbull x - Immutable program variablesbull u - Mutable program variablesbull z - Variables bound in refinement typesbull f - Function namesbull tid - Type identifiers for datatypesbull C - Datatype constructors

We can motivate the assignment of a strata to a term asfollows Values include the things that we want the reductionof a program to produce and the things stored in mutablevariables As values we have number Boolean and bit vec-tor literals variables pairing of values construction of adatatype instance and the unit valueExpressions are how we obtain new values from old and

are function application application of the binary opera-tions + and le projection from a pair mutable variablesconcatenation of bit vectors and calculation of the length ofa bit vector Expressions can be formed from these operationsapplied to values only and so a complex expression such asa function application to the sum of two variables needs tobe written as a nesting of let-statements where the parts ofthe complex expression are bound to intermediate variablesThis ensures that the types of the components of complexexpressions are made explicit since the typing of the nestedlet-statements will give us the type of each sub-expression

3

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

We have two sorts of binding statements for variables Forimmutable variables we have the conventional let statementlet x = e in s that binds e to x in s and for mutable variableswe have var u τ = v in s Operationally this allocates aslot in the mutable variable store tou and putsv into the slotThe value can be accessed usingmutable variable expressionsin the statement s and subsequent assignments to u in s willupdate the value stored for u

In the previous section we introduced Sail types and howthey include type level variables and constraints on thesevariables For MiniSail we use a different style of syntaxfor types that is more in keeping with the Liquid type para-digm The form for a MiniSail type is z b | ϕ where z isthe value variable b is the base type and ϕ the refinementconstraint The base type is the universe from which valuesof the type will come from and the refinement constraintis a logical predicate that can reference the value variableand immutable program variables that are within scope Forexample z int | 0 le z is the type for all integers greaterthan or equal to zero The refinement constraint can includeexpressions However to ensure that the constraints fallwithin a logical fragment that is decidable in Z3 we excludefrom the expression syntax for constraints function applica-tion and mutable variable access types can only depend onimmutable variables This restricted form for expressions isindicated by eminus in the grammar

Within the language are constructs for declaring the typeof a function and for binding a function The argument vari-able for the function is given explicitly in the signature andit can appear in the refinement constraint for the argumentand the type for the functionrsquos return value We also havethe means to define new datatypes by associating a typeidentifier to a list of pairs C τ where C is a constructor andτ is the type that the constructor needs to be applied to givean instance of the datatype

33 Typing Judgements and ContextsThe MiniSail type system is described using a set of bidirec-tional [7] typing judgements For values and expressions theprincipal judgement is type synthesis which has the form_ ⊢ _rArr τ For statements the judgement is type checkingwhich has the form _ ⊢ _ lArr τ The left placeholder standsfor one or more contexts and the right placeholder by theterm Where we do need to check the type of an expressionor a value for example in function application we performtype synthesis and then a check that the synthesised typeis a subtype of the required type The key judgements andcontexts are listed in Figure 3Typing contexts are the structures that provide informa-

tion about variables and other names that might appear inthe terms that we are checking and importantly that mightappear in types Different sorts of information comes intoscope or moves out of scope at different times during the

type checking of a program and so we divide this informationinto four structures

bull For type definitions Θ This is a map from type iden-tifiers to datatype definitions This context remainsstatic during the evaluation of a program and so wewill not need lemmas relating to changes to the con-text such as weakening The well-scoping conditionsfor datatypes ensures that the definitions have no freevariables and so we donrsquot need to define a substitutionfunction for this contextbull For function definitionsΦ This is a map from functionnames to function definitions This shares the sameproperties as the Θ context given above Function bod-ies can reference only the immutable variable providedby the input argument functions anywhere in Φ anddatatype identifiers and constructors provided by a Θcontextbull For immutable variables Γ This is an ordered list oftriples of the form x b[ϕ] where x is an immutablevariable b is a base type and ϕ a constraint The con-straint can reference x and any other immutable vari-able given prior in the context as well as datatypeidentifiers and constructors provided by a Θ contextWe define a substitution function for this context andthe application of substitution to a Γ can result in thelsquocollapsersquo of the contextbull For mutable variables ∆ This is a map from a mutablevariable to a type The type does not have to closedand can reference immutable variables in the Γ contextand constructors from Θ

34 Sort CheckingTypes are specified in program text or built up using thetype synthesis rules At subtype checks the two types be-ing checked are used to build an SMT problem that is thenpassed to the solver We need to provide a guarantee that theconstraints within these types lead us to a problem statementthat falls within a decidable fragment of the SMT logic Thisfragment is quantifier-free linear arithmetic with algebraicdatatypes it is multi-sorted and is decidable As we will showlater in an example immutable variables appearing in thecontext and in types are mapped to variables in the SMTproblem and assigned a type in the SMT space We need toensure that variables are assigned valid types and that theproblem statement we construct sort checks

The sort checking judgement for expressions isΦΘ Γ∆ ⊢be b This says that e has base type (sort) b in the contexts onthe left hand side A small example of the sort checking rulesare shown in Figure 4 When expressions are compared in aconstraint we require that they both have the same sort andimportantly that they sort check in the context of emptyΦ and ∆ contexts This ensures that function applicationand mutable variables will not appear in the expressions

4

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

Value v = x | n | T | F | (vv ) | C v | ()Expression e = v | v +v | v le v | f v | u | fst v | snd v | len v | concat v v

Statement s = v | let x = e in s | let x τ = s in s| if v then s else s || match v C1 x1 rArr s1 Cn xn rArr sn | var u τ = v in s | u = v| while (s1) do s2

Base type b = int | bool | unit | bitvec | tid | b times bConstraint ϕ = T| F| eminus = eminus | eminus le eminus | ϕ and ϕ | ϕ or ϕ | ϕ rArr ϕ | notϕRefinement Type τ = z b | ϕ

Function Definition fd = val f (x b[ϕ]) rarr τfunction f (x ) = s

Datatype Definition td = union tid = C1 τ1 Cn τn Definition def = td | fdProgram p = def1 defn s

Figure 2 MiniSail Grammar

Φ Function definition context Θ Type definition contextΓ Immutable variable context ∆ Mutable variable context⊢b Θ Sort checking Θ Θ Γ ⊢b ∆ Sort checking ∆Θ ⊢b Φ Sort checking Φ Θ ⊢b Γ Sort checking Γ

Θ Γ ⊢b v b Sort checking values ΦΘ Γ∆ ⊢b e b Sort checking expressionsΘ Γ ⊢b ϕ Sort checking constraints Θ Γ ⊢b τ Sort checking types

Θ Γ ⊢ v rArr τ Type synthesis values Θ Γ ⊢ v lArr τ Type checking valuesΦ∆ ΓΘ ⊢ e rArr τ Type synthesis expressions Φ∆ ΓΘ ⊢ s lArr τ Type checking statementsΘ Γ ⊢ τ1 ≲ τ2 Subtyping Θ Γ |= ϕ Validity

Figure 3 MiniSail Contexts and Judgements

Sort checking will also ensure that for example the twoarguments to an addition operation have the int sortSort checking rules were added late in the development

of the mechanisation and replace and subsume the well-scoping rules that are used in the paper formalisation Thewell-scoping rules were only checking that for examplevariables appearing in a term are present in the Γ contextand were not checking that the base type of the variable wascompatible with the term it was appearing in

35 Subtyping and ValidityThe subtyping relation is Θ Γ ⊢ τ1 ≲ τ2 and this breaksdown to a validity relation Θ Γ |= ϕ1 minusrarr ϕ2 where ϕ1 andϕ2 are the refinement predicates for the types τ1 and τ2 TheSMT solver is used to check the validity relation

We will illustrate how subtyping leads to an SMT problemwith an extended example Suppose we have a function withthe signature

val f (x int times int [0 le fst x and 0 le snd x]) rarrz int | fst x le z and snd x le z

and that we need to prove the following typing judgementfor the application of this function to the pair of variables(ab)

Φ∆a int [a = 9]b int [b = 10]Θ ⊢ f (ab) rArr τ

where τ is z int | a le z and b le zThe derivation of the judgement will include this subtyp-

ing check

Θa int [a = 9]b int [b = 10] ⊢z int times int |z = (ab) ≲

x int times int |0 le fst x and 0 le snd x

that gives this validity checking judgement

Θa int [a = 9]b int [b = 10] z int times int [z = (ab)] |=0 le fst z and 0 le snd z

which is equivalent to checking the validity of this logicalexpression

a = 9 and b = 10 and z = (ab) minusrarr 0 le fst z and 0 le snd z

5

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 2: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

The contributions of this work are as followsbull We provide further guarantees of the soundness of theSail ISA modelling language by mechanising it in atheorem proverbull We show how a language with a light-weight depen-dent type system that uses an SMT solver as part oftyping checking can be formalised using Nominal Is-abellebull We argue that with some up front cost the Isar struc-tured proof language and Nominal Isabelle allow us towrite mechanised proofs that resemble proofs found inpaper formalisation and are understandable by thosewithout expert knowledge of theorem proversbull We show the benefits of the three way co-developmentof this mechanisation with the development of the pa-per proofs and the full Sail implementation presentedin [2]

2 SailWe give a flavour of the Sail language The Sail specificationof an instruction set describes how bit strings map to in-structions in the architecture and how these instructions areexecuted The execution of an instruction manipulates theinternal state of the CPU (such as the registers) and interactswith the memory accessible from the CPU

As expected a large part of ISA specifications involve themanipulation of bit vectors For example appending two bitvectors extending a bit vector with zeros or pattern match-ing on bit vectors to decode an instruction The instructionsets modelled go far beyond small research architectures andinclude commercial CPUs architectures that describe hun-dreds or thousands of instructions Keeping track of vectorlengths across all of the instructions is something that wouldassist the specification developer To support this Sail pro-vides a light-weight dependent types system which allowsthe specification of constraints on the lengths of bit vectorsand the range of integers

An example of a type that can be specified is range(032)which is the type of integers that fall between 0 and 32 Theseconstraints extend to function signatures with type variablesthat are universally quantified For example the type of afunction that appends two bit vectors isval append = forall (nInt)(mInt)

(bits(n)bits(m))-gtbits(n+m)

The n and m are integer type variables that specify thelength of the bit vectors and are used to constrain the lengthof the bit vector returned by the function Sail also makesuse of these refinement types in its definition of arithmeticoperators such as additionval add_atom = forall n m

(atom(n)atom(m)) -gt atom(n+m)

The type atom(n) is the type of integers that are equal ton and this is how program level values are lifted into type

level values and fed into the result type of this function Sailalso provides record types (labelled product types) and uniontypes (algebraic datatypes) The former is used to model theinternal states of the CPU and the latter is used to model theAST of the instruction set

Figure 1 shows an example Sail function The functiondouble_bits takes a bit vector of length m and produces abit vector of double the length with the bit pattern of theinput duplicated The signature of double_bits illustratesthe dependent type nature of Sail the length of the resultingvector is dependent on the length of the supplied vectorFurthermore the signature includes the constraint that thelength is at least 1 In the body of the function the mutablevariable ys is used to build the result vector It is initialisedto a vector of zeros with length equal to the result vectorlength Then in the foreach loop ys is first shifted left by thelength of xs and then these new bits are set to the bits fromxs The final line of the function returns ys Type checkingmust check at each assignment that the value being assignedto ys is a subtype of the original type of ys that was set whenit is first assigned In addition the type of the return valueys is checked against the return type of the function

$include ltpreludesailgt

infix 7 ltlt

val operator ltlt = shift_bits_left forall n m

(bits(n) atom(m)) -gt bits(n)

val or_vec forall n

(bits(n) bits(n)) -gt bits(n)

overload operator | = or_bool or_vec

val length forall n bits(n) -gt atom(n)

val zero_extend forall n m m gt= n

(bits(n) atom(m)) -gt bits(m)

val double_bits forall m m gt= 1

(bits(m)) -gt bits(m + m)

function double_bits(xs) =

let m = length(xs)

ys bits(m+m) = zeros(m+m)

foreach (i from 1 to 2)

ys = ys ltlt length(xs)

ys = ys | zero_extend(xs length(ys))

ys

Figure 1 Example Sail Function

We can not adopt a full dependent type system such asthat found in Agda[13] as we want the language to be un-derstandable by the hardware engineers who are used toclassical imperative languages In addition we want the typechecking to be decidable and this motivates a light weight

2

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

dependent type system based on liquid types [17] and us-ing an SMT solver to resolve constraints Even so liquidtypes are subtle and are easy to get wrong [21] This moti-vates a formalisation of the language and proofs that providesoundness guaranteesThe Sail language includes the usual constructs for an

imperative language if statements mutable variables as-signment and while loops Also included is pattern matchingin match statements and function arguments Pattern match-ing supports complex matching on bit patterns and this isvery useful when decoding bit patterns into instructions

As well as making it engineer friendly the imperativenature of language makes it possible to export to to C andOCaml to provide efficient execution of the model In thecase of the C export this performs sufficiently well thatan OS can be booted on this lsquovirtual CPUrsquo To support theformal validation of the model the specification can alsobe exported as theorem prover code for the Isabelle HOLand Coq theorem provers Instruction set models have beendeveloped for commercial CPUs such as ARM [1] as well asresearch architectures such as CHERI [22 23]A further motivation for a formal specification of Sail is

the following Instruction set architectures for commercialCPUs are long lived and evolve over decades and so if Sail isto be used to write these architectural models Sail shoulditself be stable over this time This stability could be achievedby freezing the language and its implementation but thiswouldmean that we can not take advantage of improvementsto SMT solvers or the implementation language A clearspecification for Sail that describes the language and typesystem would mitigate the risks associated with changes tothe implementation technology

To support the re-implementation of Sail we decided thatrather than attempt a specification of the full language wewould formalise a core-calculus of the language that containsthe key interesting features of Sail such as dependent-typesand imperative constructs This led us to MiniSail which isdescribed in the next section

3 MiniSail31 Kernel Language DesignBefore describing the syntax and semantics of MiniSail wemotivate our choice of Sail features that have been includedin MiniSail We have shown above that Sail has a depen-dent type system makes use of the Z3 SMT solver for typechecking and has some imperative language features suchas mutable variablesWe want the problems that we give to Z3 to be within a

decidable logic fragment for Z3 We choose as this fragmentquantifier-free linear arithmetic with algebraic datatypeswhich is decidable in Z3 The problems we pass to the solverare generated from the constraints within the types that pass

through the type checker Types arise from typing annota-tions in the MiniSail program or are synthesised by the typechecker We choose as values and expressions a representa-tive sample from Sail that will give us a variety of typingrules that generate types

Constraints in types can contain immutable variables butnot mutable variables so we need to ensure that synthesisedtypes will not contain mutable variables Also we need to il-lustrate that type checking for mutable variable binding andusage is sound as well so we include some of the imperativefeatures of Sail into MiniSail

Liquid typing [17] that is used in Sail provides flow typingso we choose to include if andmatch statements in MiniSaillanguage and their associated flow typing rules

32 SyntaxMiniSail is intended to be a calculus for Sail rather than a pro-gramming language and has been designed for mathematicalrather than programming convenience To achieve this thesyntax of MiniSail is based on let-normal form and a strat-ification of what are expressions in the Sail language intovalues expressions and statements in MiniSail This leadsto a corresponding stratification of the typing judgementsand makes reasoning about the type system clearer Thegrammar for values expressions and statements is shown inFigure 2 along with other constructs in the language

In the grammarwemake use of the followingmeta-symbols

bull n - Numeric literalsbull x - Immutable program variablesbull u - Mutable program variablesbull z - Variables bound in refinement typesbull f - Function namesbull tid - Type identifiers for datatypesbull C - Datatype constructors

We can motivate the assignment of a strata to a term asfollows Values include the things that we want the reductionof a program to produce and the things stored in mutablevariables As values we have number Boolean and bit vec-tor literals variables pairing of values construction of adatatype instance and the unit valueExpressions are how we obtain new values from old and

are function application application of the binary opera-tions + and le projection from a pair mutable variablesconcatenation of bit vectors and calculation of the length ofa bit vector Expressions can be formed from these operationsapplied to values only and so a complex expression such asa function application to the sum of two variables needs tobe written as a nesting of let-statements where the parts ofthe complex expression are bound to intermediate variablesThis ensures that the types of the components of complexexpressions are made explicit since the typing of the nestedlet-statements will give us the type of each sub-expression

3

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

We have two sorts of binding statements for variables Forimmutable variables we have the conventional let statementlet x = e in s that binds e to x in s and for mutable variableswe have var u τ = v in s Operationally this allocates aslot in the mutable variable store tou and putsv into the slotThe value can be accessed usingmutable variable expressionsin the statement s and subsequent assignments to u in s willupdate the value stored for u

In the previous section we introduced Sail types and howthey include type level variables and constraints on thesevariables For MiniSail we use a different style of syntaxfor types that is more in keeping with the Liquid type para-digm The form for a MiniSail type is z b | ϕ where z isthe value variable b is the base type and ϕ the refinementconstraint The base type is the universe from which valuesof the type will come from and the refinement constraintis a logical predicate that can reference the value variableand immutable program variables that are within scope Forexample z int | 0 le z is the type for all integers greaterthan or equal to zero The refinement constraint can includeexpressions However to ensure that the constraints fallwithin a logical fragment that is decidable in Z3 we excludefrom the expression syntax for constraints function applica-tion and mutable variable access types can only depend onimmutable variables This restricted form for expressions isindicated by eminus in the grammar

Within the language are constructs for declaring the typeof a function and for binding a function The argument vari-able for the function is given explicitly in the signature andit can appear in the refinement constraint for the argumentand the type for the functionrsquos return value We also havethe means to define new datatypes by associating a typeidentifier to a list of pairs C τ where C is a constructor andτ is the type that the constructor needs to be applied to givean instance of the datatype

33 Typing Judgements and ContextsThe MiniSail type system is described using a set of bidirec-tional [7] typing judgements For values and expressions theprincipal judgement is type synthesis which has the form_ ⊢ _rArr τ For statements the judgement is type checkingwhich has the form _ ⊢ _ lArr τ The left placeholder standsfor one or more contexts and the right placeholder by theterm Where we do need to check the type of an expressionor a value for example in function application we performtype synthesis and then a check that the synthesised typeis a subtype of the required type The key judgements andcontexts are listed in Figure 3Typing contexts are the structures that provide informa-

tion about variables and other names that might appear inthe terms that we are checking and importantly that mightappear in types Different sorts of information comes intoscope or moves out of scope at different times during the

type checking of a program and so we divide this informationinto four structures

bull For type definitions Θ This is a map from type iden-tifiers to datatype definitions This context remainsstatic during the evaluation of a program and so wewill not need lemmas relating to changes to the con-text such as weakening The well-scoping conditionsfor datatypes ensures that the definitions have no freevariables and so we donrsquot need to define a substitutionfunction for this contextbull For function definitionsΦ This is a map from functionnames to function definitions This shares the sameproperties as the Θ context given above Function bod-ies can reference only the immutable variable providedby the input argument functions anywhere in Φ anddatatype identifiers and constructors provided by a Θcontextbull For immutable variables Γ This is an ordered list oftriples of the form x b[ϕ] where x is an immutablevariable b is a base type and ϕ a constraint The con-straint can reference x and any other immutable vari-able given prior in the context as well as datatypeidentifiers and constructors provided by a Θ contextWe define a substitution function for this context andthe application of substitution to a Γ can result in thelsquocollapsersquo of the contextbull For mutable variables ∆ This is a map from a mutablevariable to a type The type does not have to closedand can reference immutable variables in the Γ contextand constructors from Θ

34 Sort CheckingTypes are specified in program text or built up using thetype synthesis rules At subtype checks the two types be-ing checked are used to build an SMT problem that is thenpassed to the solver We need to provide a guarantee that theconstraints within these types lead us to a problem statementthat falls within a decidable fragment of the SMT logic Thisfragment is quantifier-free linear arithmetic with algebraicdatatypes it is multi-sorted and is decidable As we will showlater in an example immutable variables appearing in thecontext and in types are mapped to variables in the SMTproblem and assigned a type in the SMT space We need toensure that variables are assigned valid types and that theproblem statement we construct sort checks

The sort checking judgement for expressions isΦΘ Γ∆ ⊢be b This says that e has base type (sort) b in the contexts onthe left hand side A small example of the sort checking rulesare shown in Figure 4 When expressions are compared in aconstraint we require that they both have the same sort andimportantly that they sort check in the context of emptyΦ and ∆ contexts This ensures that function applicationand mutable variables will not appear in the expressions

4

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

Value v = x | n | T | F | (vv ) | C v | ()Expression e = v | v +v | v le v | f v | u | fst v | snd v | len v | concat v v

Statement s = v | let x = e in s | let x τ = s in s| if v then s else s || match v C1 x1 rArr s1 Cn xn rArr sn | var u τ = v in s | u = v| while (s1) do s2

Base type b = int | bool | unit | bitvec | tid | b times bConstraint ϕ = T| F| eminus = eminus | eminus le eminus | ϕ and ϕ | ϕ or ϕ | ϕ rArr ϕ | notϕRefinement Type τ = z b | ϕ

Function Definition fd = val f (x b[ϕ]) rarr τfunction f (x ) = s

Datatype Definition td = union tid = C1 τ1 Cn τn Definition def = td | fdProgram p = def1 defn s

Figure 2 MiniSail Grammar

Φ Function definition context Θ Type definition contextΓ Immutable variable context ∆ Mutable variable context⊢b Θ Sort checking Θ Θ Γ ⊢b ∆ Sort checking ∆Θ ⊢b Φ Sort checking Φ Θ ⊢b Γ Sort checking Γ

Θ Γ ⊢b v b Sort checking values ΦΘ Γ∆ ⊢b e b Sort checking expressionsΘ Γ ⊢b ϕ Sort checking constraints Θ Γ ⊢b τ Sort checking types

Θ Γ ⊢ v rArr τ Type synthesis values Θ Γ ⊢ v lArr τ Type checking valuesΦ∆ ΓΘ ⊢ e rArr τ Type synthesis expressions Φ∆ ΓΘ ⊢ s lArr τ Type checking statementsΘ Γ ⊢ τ1 ≲ τ2 Subtyping Θ Γ |= ϕ Validity

Figure 3 MiniSail Contexts and Judgements

Sort checking will also ensure that for example the twoarguments to an addition operation have the int sortSort checking rules were added late in the development

of the mechanisation and replace and subsume the well-scoping rules that are used in the paper formalisation Thewell-scoping rules were only checking that for examplevariables appearing in a term are present in the Γ contextand were not checking that the base type of the variable wascompatible with the term it was appearing in

35 Subtyping and ValidityThe subtyping relation is Θ Γ ⊢ τ1 ≲ τ2 and this breaksdown to a validity relation Θ Γ |= ϕ1 minusrarr ϕ2 where ϕ1 andϕ2 are the refinement predicates for the types τ1 and τ2 TheSMT solver is used to check the validity relation

We will illustrate how subtyping leads to an SMT problemwith an extended example Suppose we have a function withthe signature

val f (x int times int [0 le fst x and 0 le snd x]) rarrz int | fst x le z and snd x le z

and that we need to prove the following typing judgementfor the application of this function to the pair of variables(ab)

Φ∆a int [a = 9]b int [b = 10]Θ ⊢ f (ab) rArr τ

where τ is z int | a le z and b le zThe derivation of the judgement will include this subtyp-

ing check

Θa int [a = 9]b int [b = 10] ⊢z int times int |z = (ab) ≲

x int times int |0 le fst x and 0 le snd x

that gives this validity checking judgement

Θa int [a = 9]b int [b = 10] z int times int [z = (ab)] |=0 le fst z and 0 le snd z

which is equivalent to checking the validity of this logicalexpression

a = 9 and b = 10 and z = (ab) minusrarr 0 le fst z and 0 le snd z

5

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 3: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

dependent type system based on liquid types [17] and us-ing an SMT solver to resolve constraints Even so liquidtypes are subtle and are easy to get wrong [21] This moti-vates a formalisation of the language and proofs that providesoundness guaranteesThe Sail language includes the usual constructs for an

imperative language if statements mutable variables as-signment and while loops Also included is pattern matchingin match statements and function arguments Pattern match-ing supports complex matching on bit patterns and this isvery useful when decoding bit patterns into instructions

As well as making it engineer friendly the imperativenature of language makes it possible to export to to C andOCaml to provide efficient execution of the model In thecase of the C export this performs sufficiently well thatan OS can be booted on this lsquovirtual CPUrsquo To support theformal validation of the model the specification can alsobe exported as theorem prover code for the Isabelle HOLand Coq theorem provers Instruction set models have beendeveloped for commercial CPUs such as ARM [1] as well asresearch architectures such as CHERI [22 23]A further motivation for a formal specification of Sail is

the following Instruction set architectures for commercialCPUs are long lived and evolve over decades and so if Sail isto be used to write these architectural models Sail shoulditself be stable over this time This stability could be achievedby freezing the language and its implementation but thiswouldmean that we can not take advantage of improvementsto SMT solvers or the implementation language A clearspecification for Sail that describes the language and typesystem would mitigate the risks associated with changes tothe implementation technology

To support the re-implementation of Sail we decided thatrather than attempt a specification of the full language wewould formalise a core-calculus of the language that containsthe key interesting features of Sail such as dependent-typesand imperative constructs This led us to MiniSail which isdescribed in the next section

3 MiniSail31 Kernel Language DesignBefore describing the syntax and semantics of MiniSail wemotivate our choice of Sail features that have been includedin MiniSail We have shown above that Sail has a depen-dent type system makes use of the Z3 SMT solver for typechecking and has some imperative language features suchas mutable variablesWe want the problems that we give to Z3 to be within a

decidable logic fragment for Z3 We choose as this fragmentquantifier-free linear arithmetic with algebraic datatypeswhich is decidable in Z3 The problems we pass to the solverare generated from the constraints within the types that pass

through the type checker Types arise from typing annota-tions in the MiniSail program or are synthesised by the typechecker We choose as values and expressions a representa-tive sample from Sail that will give us a variety of typingrules that generate types

Constraints in types can contain immutable variables butnot mutable variables so we need to ensure that synthesisedtypes will not contain mutable variables Also we need to il-lustrate that type checking for mutable variable binding andusage is sound as well so we include some of the imperativefeatures of Sail into MiniSail

Liquid typing [17] that is used in Sail provides flow typingso we choose to include if andmatch statements in MiniSaillanguage and their associated flow typing rules

32 SyntaxMiniSail is intended to be a calculus for Sail rather than a pro-gramming language and has been designed for mathematicalrather than programming convenience To achieve this thesyntax of MiniSail is based on let-normal form and a strat-ification of what are expressions in the Sail language intovalues expressions and statements in MiniSail This leadsto a corresponding stratification of the typing judgementsand makes reasoning about the type system clearer Thegrammar for values expressions and statements is shown inFigure 2 along with other constructs in the language

In the grammarwemake use of the followingmeta-symbols

bull n - Numeric literalsbull x - Immutable program variablesbull u - Mutable program variablesbull z - Variables bound in refinement typesbull f - Function namesbull tid - Type identifiers for datatypesbull C - Datatype constructors

We can motivate the assignment of a strata to a term asfollows Values include the things that we want the reductionof a program to produce and the things stored in mutablevariables As values we have number Boolean and bit vec-tor literals variables pairing of values construction of adatatype instance and the unit valueExpressions are how we obtain new values from old and

are function application application of the binary opera-tions + and le projection from a pair mutable variablesconcatenation of bit vectors and calculation of the length ofa bit vector Expressions can be formed from these operationsapplied to values only and so a complex expression such asa function application to the sum of two variables needs tobe written as a nesting of let-statements where the parts ofthe complex expression are bound to intermediate variablesThis ensures that the types of the components of complexexpressions are made explicit since the typing of the nestedlet-statements will give us the type of each sub-expression

3

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

We have two sorts of binding statements for variables Forimmutable variables we have the conventional let statementlet x = e in s that binds e to x in s and for mutable variableswe have var u τ = v in s Operationally this allocates aslot in the mutable variable store tou and putsv into the slotThe value can be accessed usingmutable variable expressionsin the statement s and subsequent assignments to u in s willupdate the value stored for u

In the previous section we introduced Sail types and howthey include type level variables and constraints on thesevariables For MiniSail we use a different style of syntaxfor types that is more in keeping with the Liquid type para-digm The form for a MiniSail type is z b | ϕ where z isthe value variable b is the base type and ϕ the refinementconstraint The base type is the universe from which valuesof the type will come from and the refinement constraintis a logical predicate that can reference the value variableand immutable program variables that are within scope Forexample z int | 0 le z is the type for all integers greaterthan or equal to zero The refinement constraint can includeexpressions However to ensure that the constraints fallwithin a logical fragment that is decidable in Z3 we excludefrom the expression syntax for constraints function applica-tion and mutable variable access types can only depend onimmutable variables This restricted form for expressions isindicated by eminus in the grammar

Within the language are constructs for declaring the typeof a function and for binding a function The argument vari-able for the function is given explicitly in the signature andit can appear in the refinement constraint for the argumentand the type for the functionrsquos return value We also havethe means to define new datatypes by associating a typeidentifier to a list of pairs C τ where C is a constructor andτ is the type that the constructor needs to be applied to givean instance of the datatype

33 Typing Judgements and ContextsThe MiniSail type system is described using a set of bidirec-tional [7] typing judgements For values and expressions theprincipal judgement is type synthesis which has the form_ ⊢ _rArr τ For statements the judgement is type checkingwhich has the form _ ⊢ _ lArr τ The left placeholder standsfor one or more contexts and the right placeholder by theterm Where we do need to check the type of an expressionor a value for example in function application we performtype synthesis and then a check that the synthesised typeis a subtype of the required type The key judgements andcontexts are listed in Figure 3Typing contexts are the structures that provide informa-

tion about variables and other names that might appear inthe terms that we are checking and importantly that mightappear in types Different sorts of information comes intoscope or moves out of scope at different times during the

type checking of a program and so we divide this informationinto four structures

bull For type definitions Θ This is a map from type iden-tifiers to datatype definitions This context remainsstatic during the evaluation of a program and so wewill not need lemmas relating to changes to the con-text such as weakening The well-scoping conditionsfor datatypes ensures that the definitions have no freevariables and so we donrsquot need to define a substitutionfunction for this contextbull For function definitionsΦ This is a map from functionnames to function definitions This shares the sameproperties as the Θ context given above Function bod-ies can reference only the immutable variable providedby the input argument functions anywhere in Φ anddatatype identifiers and constructors provided by a Θcontextbull For immutable variables Γ This is an ordered list oftriples of the form x b[ϕ] where x is an immutablevariable b is a base type and ϕ a constraint The con-straint can reference x and any other immutable vari-able given prior in the context as well as datatypeidentifiers and constructors provided by a Θ contextWe define a substitution function for this context andthe application of substitution to a Γ can result in thelsquocollapsersquo of the contextbull For mutable variables ∆ This is a map from a mutablevariable to a type The type does not have to closedand can reference immutable variables in the Γ contextand constructors from Θ

34 Sort CheckingTypes are specified in program text or built up using thetype synthesis rules At subtype checks the two types be-ing checked are used to build an SMT problem that is thenpassed to the solver We need to provide a guarantee that theconstraints within these types lead us to a problem statementthat falls within a decidable fragment of the SMT logic Thisfragment is quantifier-free linear arithmetic with algebraicdatatypes it is multi-sorted and is decidable As we will showlater in an example immutable variables appearing in thecontext and in types are mapped to variables in the SMTproblem and assigned a type in the SMT space We need toensure that variables are assigned valid types and that theproblem statement we construct sort checks

The sort checking judgement for expressions isΦΘ Γ∆ ⊢be b This says that e has base type (sort) b in the contexts onthe left hand side A small example of the sort checking rulesare shown in Figure 4 When expressions are compared in aconstraint we require that they both have the same sort andimportantly that they sort check in the context of emptyΦ and ∆ contexts This ensures that function applicationand mutable variables will not appear in the expressions

4

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

Value v = x | n | T | F | (vv ) | C v | ()Expression e = v | v +v | v le v | f v | u | fst v | snd v | len v | concat v v

Statement s = v | let x = e in s | let x τ = s in s| if v then s else s || match v C1 x1 rArr s1 Cn xn rArr sn | var u τ = v in s | u = v| while (s1) do s2

Base type b = int | bool | unit | bitvec | tid | b times bConstraint ϕ = T| F| eminus = eminus | eminus le eminus | ϕ and ϕ | ϕ or ϕ | ϕ rArr ϕ | notϕRefinement Type τ = z b | ϕ

Function Definition fd = val f (x b[ϕ]) rarr τfunction f (x ) = s

Datatype Definition td = union tid = C1 τ1 Cn τn Definition def = td | fdProgram p = def1 defn s

Figure 2 MiniSail Grammar

Φ Function definition context Θ Type definition contextΓ Immutable variable context ∆ Mutable variable context⊢b Θ Sort checking Θ Θ Γ ⊢b ∆ Sort checking ∆Θ ⊢b Φ Sort checking Φ Θ ⊢b Γ Sort checking Γ

Θ Γ ⊢b v b Sort checking values ΦΘ Γ∆ ⊢b e b Sort checking expressionsΘ Γ ⊢b ϕ Sort checking constraints Θ Γ ⊢b τ Sort checking types

Θ Γ ⊢ v rArr τ Type synthesis values Θ Γ ⊢ v lArr τ Type checking valuesΦ∆ ΓΘ ⊢ e rArr τ Type synthesis expressions Φ∆ ΓΘ ⊢ s lArr τ Type checking statementsΘ Γ ⊢ τ1 ≲ τ2 Subtyping Θ Γ |= ϕ Validity

Figure 3 MiniSail Contexts and Judgements

Sort checking will also ensure that for example the twoarguments to an addition operation have the int sortSort checking rules were added late in the development

of the mechanisation and replace and subsume the well-scoping rules that are used in the paper formalisation Thewell-scoping rules were only checking that for examplevariables appearing in a term are present in the Γ contextand were not checking that the base type of the variable wascompatible with the term it was appearing in

35 Subtyping and ValidityThe subtyping relation is Θ Γ ⊢ τ1 ≲ τ2 and this breaksdown to a validity relation Θ Γ |= ϕ1 minusrarr ϕ2 where ϕ1 andϕ2 are the refinement predicates for the types τ1 and τ2 TheSMT solver is used to check the validity relation

We will illustrate how subtyping leads to an SMT problemwith an extended example Suppose we have a function withthe signature

val f (x int times int [0 le fst x and 0 le snd x]) rarrz int | fst x le z and snd x le z

and that we need to prove the following typing judgementfor the application of this function to the pair of variables(ab)

Φ∆a int [a = 9]b int [b = 10]Θ ⊢ f (ab) rArr τ

where τ is z int | a le z and b le zThe derivation of the judgement will include this subtyp-

ing check

Θa int [a = 9]b int [b = 10] ⊢z int times int |z = (ab) ≲

x int times int |0 le fst x and 0 le snd x

that gives this validity checking judgement

Θa int [a = 9]b int [b = 10] z int times int [z = (ab)] |=0 le fst z and 0 le snd z

which is equivalent to checking the validity of this logicalexpression

a = 9 and b = 10 and z = (ab) minusrarr 0 le fst z and 0 le snd z

5

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 4: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

We have two sorts of binding statements for variables Forimmutable variables we have the conventional let statementlet x = e in s that binds e to x in s and for mutable variableswe have var u τ = v in s Operationally this allocates aslot in the mutable variable store tou and putsv into the slotThe value can be accessed usingmutable variable expressionsin the statement s and subsequent assignments to u in s willupdate the value stored for u

In the previous section we introduced Sail types and howthey include type level variables and constraints on thesevariables For MiniSail we use a different style of syntaxfor types that is more in keeping with the Liquid type para-digm The form for a MiniSail type is z b | ϕ where z isthe value variable b is the base type and ϕ the refinementconstraint The base type is the universe from which valuesof the type will come from and the refinement constraintis a logical predicate that can reference the value variableand immutable program variables that are within scope Forexample z int | 0 le z is the type for all integers greaterthan or equal to zero The refinement constraint can includeexpressions However to ensure that the constraints fallwithin a logical fragment that is decidable in Z3 we excludefrom the expression syntax for constraints function applica-tion and mutable variable access types can only depend onimmutable variables This restricted form for expressions isindicated by eminus in the grammar

Within the language are constructs for declaring the typeof a function and for binding a function The argument vari-able for the function is given explicitly in the signature andit can appear in the refinement constraint for the argumentand the type for the functionrsquos return value We also havethe means to define new datatypes by associating a typeidentifier to a list of pairs C τ where C is a constructor andτ is the type that the constructor needs to be applied to givean instance of the datatype

33 Typing Judgements and ContextsThe MiniSail type system is described using a set of bidirec-tional [7] typing judgements For values and expressions theprincipal judgement is type synthesis which has the form_ ⊢ _rArr τ For statements the judgement is type checkingwhich has the form _ ⊢ _ lArr τ The left placeholder standsfor one or more contexts and the right placeholder by theterm Where we do need to check the type of an expressionor a value for example in function application we performtype synthesis and then a check that the synthesised typeis a subtype of the required type The key judgements andcontexts are listed in Figure 3Typing contexts are the structures that provide informa-

tion about variables and other names that might appear inthe terms that we are checking and importantly that mightappear in types Different sorts of information comes intoscope or moves out of scope at different times during the

type checking of a program and so we divide this informationinto four structures

bull For type definitions Θ This is a map from type iden-tifiers to datatype definitions This context remainsstatic during the evaluation of a program and so wewill not need lemmas relating to changes to the con-text such as weakening The well-scoping conditionsfor datatypes ensures that the definitions have no freevariables and so we donrsquot need to define a substitutionfunction for this contextbull For function definitionsΦ This is a map from functionnames to function definitions This shares the sameproperties as the Θ context given above Function bod-ies can reference only the immutable variable providedby the input argument functions anywhere in Φ anddatatype identifiers and constructors provided by a Θcontextbull For immutable variables Γ This is an ordered list oftriples of the form x b[ϕ] where x is an immutablevariable b is a base type and ϕ a constraint The con-straint can reference x and any other immutable vari-able given prior in the context as well as datatypeidentifiers and constructors provided by a Θ contextWe define a substitution function for this context andthe application of substitution to a Γ can result in thelsquocollapsersquo of the contextbull For mutable variables ∆ This is a map from a mutablevariable to a type The type does not have to closedand can reference immutable variables in the Γ contextand constructors from Θ

34 Sort CheckingTypes are specified in program text or built up using thetype synthesis rules At subtype checks the two types be-ing checked are used to build an SMT problem that is thenpassed to the solver We need to provide a guarantee that theconstraints within these types lead us to a problem statementthat falls within a decidable fragment of the SMT logic Thisfragment is quantifier-free linear arithmetic with algebraicdatatypes it is multi-sorted and is decidable As we will showlater in an example immutable variables appearing in thecontext and in types are mapped to variables in the SMTproblem and assigned a type in the SMT space We need toensure that variables are assigned valid types and that theproblem statement we construct sort checks

The sort checking judgement for expressions isΦΘ Γ∆ ⊢be b This says that e has base type (sort) b in the contexts onthe left hand side A small example of the sort checking rulesare shown in Figure 4 When expressions are compared in aconstraint we require that they both have the same sort andimportantly that they sort check in the context of emptyΦ and ∆ contexts This ensures that function applicationand mutable variables will not appear in the expressions

4

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

Value v = x | n | T | F | (vv ) | C v | ()Expression e = v | v +v | v le v | f v | u | fst v | snd v | len v | concat v v

Statement s = v | let x = e in s | let x τ = s in s| if v then s else s || match v C1 x1 rArr s1 Cn xn rArr sn | var u τ = v in s | u = v| while (s1) do s2

Base type b = int | bool | unit | bitvec | tid | b times bConstraint ϕ = T| F| eminus = eminus | eminus le eminus | ϕ and ϕ | ϕ or ϕ | ϕ rArr ϕ | notϕRefinement Type τ = z b | ϕ

Function Definition fd = val f (x b[ϕ]) rarr τfunction f (x ) = s

Datatype Definition td = union tid = C1 τ1 Cn τn Definition def = td | fdProgram p = def1 defn s

Figure 2 MiniSail Grammar

Φ Function definition context Θ Type definition contextΓ Immutable variable context ∆ Mutable variable context⊢b Θ Sort checking Θ Θ Γ ⊢b ∆ Sort checking ∆Θ ⊢b Φ Sort checking Φ Θ ⊢b Γ Sort checking Γ

Θ Γ ⊢b v b Sort checking values ΦΘ Γ∆ ⊢b e b Sort checking expressionsΘ Γ ⊢b ϕ Sort checking constraints Θ Γ ⊢b τ Sort checking types

Θ Γ ⊢ v rArr τ Type synthesis values Θ Γ ⊢ v lArr τ Type checking valuesΦ∆ ΓΘ ⊢ e rArr τ Type synthesis expressions Φ∆ ΓΘ ⊢ s lArr τ Type checking statementsΘ Γ ⊢ τ1 ≲ τ2 Subtyping Θ Γ |= ϕ Validity

Figure 3 MiniSail Contexts and Judgements

Sort checking will also ensure that for example the twoarguments to an addition operation have the int sortSort checking rules were added late in the development

of the mechanisation and replace and subsume the well-scoping rules that are used in the paper formalisation Thewell-scoping rules were only checking that for examplevariables appearing in a term are present in the Γ contextand were not checking that the base type of the variable wascompatible with the term it was appearing in

35 Subtyping and ValidityThe subtyping relation is Θ Γ ⊢ τ1 ≲ τ2 and this breaksdown to a validity relation Θ Γ |= ϕ1 minusrarr ϕ2 where ϕ1 andϕ2 are the refinement predicates for the types τ1 and τ2 TheSMT solver is used to check the validity relation

We will illustrate how subtyping leads to an SMT problemwith an extended example Suppose we have a function withthe signature

val f (x int times int [0 le fst x and 0 le snd x]) rarrz int | fst x le z and snd x le z

and that we need to prove the following typing judgementfor the application of this function to the pair of variables(ab)

Φ∆a int [a = 9]b int [b = 10]Θ ⊢ f (ab) rArr τ

where τ is z int | a le z and b le zThe derivation of the judgement will include this subtyp-

ing check

Θa int [a = 9]b int [b = 10] ⊢z int times int |z = (ab) ≲

x int times int |0 le fst x and 0 le snd x

that gives this validity checking judgement

Θa int [a = 9]b int [b = 10] z int times int [z = (ab)] |=0 le fst z and 0 le snd z

which is equivalent to checking the validity of this logicalexpression

a = 9 and b = 10 and z = (ab) minusrarr 0 le fst z and 0 le snd z

5

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 5: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

441

442

443

444

445

446

447

448

449

450

451

452

453

454

455

456

457

458

459

460

461

462

463

464

465

466

467

468

469

470

471

472

473

474

475

476

477

478

479

480

481

482

483

484

485

486

487

488

489

490

491

492

493

494

495

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

496

497

498

499

500

501

502

503

504

505

506

507

508

509

510

511

512

513

514

515

516

517

518

519

520

521

522

523

524

525

526

527

528

529

530

531

532

533

534

535

536

537

538

539

540

541

542

543

544

545

546

547

548

549

550

Value v = x | n | T | F | (vv ) | C v | ()Expression e = v | v +v | v le v | f v | u | fst v | snd v | len v | concat v v

Statement s = v | let x = e in s | let x τ = s in s| if v then s else s || match v C1 x1 rArr s1 Cn xn rArr sn | var u τ = v in s | u = v| while (s1) do s2

Base type b = int | bool | unit | bitvec | tid | b times bConstraint ϕ = T| F| eminus = eminus | eminus le eminus | ϕ and ϕ | ϕ or ϕ | ϕ rArr ϕ | notϕRefinement Type τ = z b | ϕ

Function Definition fd = val f (x b[ϕ]) rarr τfunction f (x ) = s

Datatype Definition td = union tid = C1 τ1 Cn τn Definition def = td | fdProgram p = def1 defn s

Figure 2 MiniSail Grammar

Φ Function definition context Θ Type definition contextΓ Immutable variable context ∆ Mutable variable context⊢b Θ Sort checking Θ Θ Γ ⊢b ∆ Sort checking ∆Θ ⊢b Φ Sort checking Φ Θ ⊢b Γ Sort checking Γ

Θ Γ ⊢b v b Sort checking values ΦΘ Γ∆ ⊢b e b Sort checking expressionsΘ Γ ⊢b ϕ Sort checking constraints Θ Γ ⊢b τ Sort checking types

Θ Γ ⊢ v rArr τ Type synthesis values Θ Γ ⊢ v lArr τ Type checking valuesΦ∆ ΓΘ ⊢ e rArr τ Type synthesis expressions Φ∆ ΓΘ ⊢ s lArr τ Type checking statementsΘ Γ ⊢ τ1 ≲ τ2 Subtyping Θ Γ |= ϕ Validity

Figure 3 MiniSail Contexts and Judgements

Sort checking will also ensure that for example the twoarguments to an addition operation have the int sortSort checking rules were added late in the development

of the mechanisation and replace and subsume the well-scoping rules that are used in the paper formalisation Thewell-scoping rules were only checking that for examplevariables appearing in a term are present in the Γ contextand were not checking that the base type of the variable wascompatible with the term it was appearing in

35 Subtyping and ValidityThe subtyping relation is Θ Γ ⊢ τ1 ≲ τ2 and this breaksdown to a validity relation Θ Γ |= ϕ1 minusrarr ϕ2 where ϕ1 andϕ2 are the refinement predicates for the types τ1 and τ2 TheSMT solver is used to check the validity relation

We will illustrate how subtyping leads to an SMT problemwith an extended example Suppose we have a function withthe signature

val f (x int times int [0 le fst x and 0 le snd x]) rarrz int | fst x le z and snd x le z

and that we need to prove the following typing judgementfor the application of this function to the pair of variables(ab)

Φ∆a int [a = 9]b int [b = 10]Θ ⊢ f (ab) rArr τ

where τ is z int | a le z and b le zThe derivation of the judgement will include this subtyp-

ing check

Θa int [a = 9]b int [b = 10] ⊢z int times int |z = (ab) ≲

x int times int |0 le fst x and 0 le snd x

that gives this validity checking judgement

Θa int [a = 9]b int [b = 10] z int times int [z = (ab)] |=0 le fst z and 0 le snd z

which is equivalent to checking the validity of this logicalexpression

a = 9 and b = 10 and z = (ab) minusrarr 0 le fst z and 0 le snd z

5

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 6: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

551

552

553

554

555

556

557

558

559

560

561

562

563

564

565

566

567

568

569

570

571

572

573

574

575

576

577

578

579

580

581

582

583

584

585

586

587

588

589

590

591

592

593

594

595

596

597

598

599

600

601

602

603

604

605

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

647

648

649

650

651

652

653

654

655

656

657

658

659

660

To check this we generate the following Z3 script where weask Z3 to check the satisfiability of the negated goal

(declare-datatypes (T1 T2)

((Pair (mk-pair (fst T1) (snd T2)))))

(declare-const a Int)

(declare-const b Int)

(declare-const z (Pair Int Int))

(define-fun constraint () Bool (and (= a 9) (= b 10)

(= z (mk-pair a b)) (not (and ( lt= 0 (fst z ))

( lt= 0 (snd z))))))

(assert constraint)

(check-sat)

The first line is a datatype declaration for a pair type Ifour context had included variables for a datatype then adeclaration for this type would appear here The next threelines declare the variables used one declaration for eachvariable in the context The Z3 type for the variable is derivedfrom theMiniSail base type for the variable as per the contextThe final three lines define the goal and request Z3 to checkthat the goal is satisfiable or not If the goal is unsatisfiablethen it means that the subtype check has succeeded

36 Typing RulesA MiniSail program is composed of a set of datatype defini-tions a set of function definitions and a single statement Tocheck a program we will sort check the datatype and thefunction definitions We will also check that the functionstype-check meaning that a function body has the type indi-cated by the functionrsquos return type in a context containingjust the functions argument and type Typing checking of astatement cascades down the statementrsquos structure and atpoints where checking of expressions or variables occurs itflips over to type synthesis with subtyping checks occurringat the change over points The important typing rules areshow in Figure 5

For all of the rules if a context appears in the rule conclu-sion but not in a typing premise then we need to include asort checking judgement as a premise For example in rule8 we include sort checking for ∆ as we are introducing the∆ context Sort checking of ∆ will sort check the types in ∆ensure that all free variables in the types are in the contextΓ and that all data constructors are in ΘFor the type synthesis rules for values we only require

the Θ and Γ contexts and the form for the type synthesisedfor a value v is always z b |z = v We are able to includev in the constraint as the rules check that v is well-sortedfor the base cases literals are automatically well-sorted andvariables are well sorted if they are in a well-sorted Γ context

As mutable variables and function applications are in-cluded as expressions we include the contexts ∆ and ΦThe type synthesised can not have the form z b |z = e

as not all expressions in our language are valid constraint-expressions For mutable variables the type synthesised isthe type of the variable as recorded in the ∆ context andfor function application expressions the type synthesisedis the return type of the function with substitution of theargument value into the return type

In a number of the rules for example function application(12) mutable variable assignment (16) and while loops (20)we are required to check that the variable or expression hasa particular type so we have a single type checking rule forvalues (rule 23) and for expressions (rule 24)

For statements we have just a checking judgement Thestatement var u τ = v in s that introduces and scopesmutable variables has a typing rule where we check v isa subtype of τ and that s has the required type with u τadded to the mutable variable context ∆ For assignmentstatements u = v we require the type of these to be unitand check that the type of v is a subtype of τ where u τ isan entry in ∆Typing for the if and thematch statement incorporates

flow typing This is where the predicate for the type beingchecked for in the branches is checked under the assumptionthat the discriminator has the value associated to the branchFor example in the positive branch of the if statement (rule19) we type check against a type that has the constraint v =T and (ϕ1[vx]) =rArr (ϕ[z1z]) For the match statement (rule21) we use a slightly different form Since we are binding anew variable xi in each branch we add this variable to thecontext with the constraintϕi [xizi ]andv = Ci xi While loopsare straight-forward - the condition needs to be a Booleanand the body the unit We donrsquot attempt flow typing for whileloops One difficulty is that the discriminator for the whilestatement is a statement it is not obvious how to includea constraint on this in the type checking of the positivebranch (while statement body) and the negative branch (thestatements after the while loop)

37 Operational SemanticsWe define the operational semantics using a small-step tran-sition relation

Φ ⊢ ⟨δ s⟩ minusrarr ⟨δ prime s prime⟩

Where δ is themutable store that maps frommutable variablenames to values and gives the current value of a mutablevariable For immutable variables we rely on the substitutionfunction to replace immutable variables with the value fromthe variablersquos binding statementWe now outline how the transition relation works on

different forms of statements For let statements let x =v in s we have a transition step case for each form for e Ife is a value then the reduction is to ⟨δ s[vx]⟩ For the casewhere e is the application of a binary operator and the twoarguments are integers then we will perform the calculationand the resulting statement is s with the e replaced with

6

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 7: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

661

662

663

664

665

666

667

668

669

670

671

672

673

674

675

676

677

678

679

680

681

682

683

684

685

686

687

688

689

690

691

692

693

694

695

696

697

698

699

700

701

702

703

704

705

706

707

708

709

710

711

712

713

714

715

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

716

717

718

719

720

721

722

723

724

725

726

727

728

729

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748

749

750

751

752

753

754

755

756

757

758

759

760

761

762

763

764

765

766

767

768

769

770

Θ Γ ⊢b ∆Θ ⊢b ΦΘ Γ ⊢b v bval f (x b[ϕ]) rarr τ isin Φ

ΦΘ Γ∆ ⊢b f v b(τ )2

ϵ Θ Γ ϵ ⊢b e1 bϵ Θ Γ ϵ ⊢b e2 bΘ Γ ⊢b e1 = e2

1

Figure 4MiniSail Sort Checking

Θ ⊢b Γ

Θ Γ ⊢ n rArr z int|z = n1

Θ ⊢b Γ

Θ Γ ⊢ TrArr z bool|z = T2

Θ ⊢b Γ

Θ Γ ⊢ FrArr z bool|z = F3

Θ ⊢b Γ

Θ Γ ⊢ () rArr z unit|z = ()4

Θ ⊢b Γx b[ϕ] isin Γ

Θ Γ ⊢ x rArr z b|z = x5

Θ Γ ⊢ v1 rArr z1 b1 |ϕ1

Θ Γ ⊢ v2 rArr z2 b2 |ϕ2

Θ Γ ⊢ (v1 v2) rArr z b1 lowast b2 |z = (v1 v2)6

union tid = Ci τii isin Θ

Θ Γ ⊢ v lArr τ

Θ Γ ⊢ Cj v rArr z tid |z = Cj v7

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v1 rArr z1 int|ϕ1

Θ Γ ⊢ v2 rArr z2 int|ϕ2

Φ∆ ΓΘ ⊢ v1 + v2 rArr z3 int|z3 = v1 + v28

Θ ⊢b ΦΘ Γ ⊢b ∆val f (x b[ϕ]) rarr τ isin ΦΘ Γ ⊢ v lArr z b|ϕΦ∆ ΓΘ ⊢ f v rArr τ [vx]

9

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆

Φ∆ ΓΘ ⊢ u rArr τ10

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ fst v rArr z b1 |z = fst v11

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v rArr z b1 lowast b2 |ϕ

Φ∆ ΓΘ ⊢ snd v rArr z b2 |z = snd v12

Θ ⊢b ΦΘ Γ ⊢b ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ v lArr τ13

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ14

Φ∆ ΓΘ ⊢ e rArr z b|ϕΦ∆ Γ x b[ϕ[xz]]Θ ⊢ s lArr τ

Φ∆ ΓΘ ⊢ let x = e in s lArr τ15

u lt dom(δ )Θ Γ ⊢ v lArr τΦ∆ u τ ΓΘ ⊢ s lArr τ2

Φ∆ ΓΘ ⊢ var u τ = v in s lArr τ216

Θ ⊢b ΦΘ Γ ⊢b ∆u τ isin ∆Θ Γ ⊢ v lArr τ

Φ∆ ΓΘ ⊢ u = v lArr z unit|⊤17

Φ∆ ΓΘ ⊢ s1 lArr z unit|⊤Φ∆ ΓΘ ⊢ s2 lArr τ

Φ∆ ΓΘ ⊢ s1 s2 lArr τ18

Θ Γ ⊢ v rArr x bool|ϕ1

Φ∆ ΓΘ ⊢ s1 lArr z1 b|(v = T and (ϕ1[vx])) =rArr (ϕ[z1z])Φ∆ ΓΘ ⊢ s2 lArr z2 b|(v = F and (ϕ1[vx])) =rArr (ϕ[z2z])

Φ∆ ΓΘ ⊢ if v then s1 else s2 lArr z b|ϕ19

Φ∆ ΓΘ ⊢ s1 lArr z bool|⊤Φ∆ ΓΘ ⊢ s2 lArr z unit|⊤

Φ∆ ΓΘ ⊢ while (s1) do s2 lArr z unit|⊤20

union tid = Ci zi bi |ϕii isin Θ

Θ Γ ⊢ v rArr z tid |ϕΦ∆ Γ xi bi[ϕi[xizi] and v = Ci xi and (ϕ[vz])]Θ ⊢ si lArr τ

i

Φ∆ ΓΘ ⊢ match v of Ci xi rArr siilArr τ

21

Θ Γ ⊢b z1 b|ϕ1

Θ Γ ⊢b z2 b|ϕ2

P Γ z3 b[ϕ1[z3z1]] |= ϕ2[z3z1]Θ Γ ⊢ z1 b|ϕ1 ≲ z2 b|ϕ2

22

Θ Γ ⊢ v rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Θ Γ ⊢ v lArr z1 b|ϕ123

Φ∆ ΓΘ ⊢ e rArr z2 b|ϕ2

Θ Γ ⊢ z2 b|ϕ2 ≲ z1 b|ϕ1

Φ∆ ΓΘ ⊢ e lArr z1 b|ϕ124

Figure 5MiniSail Type System

the result of the operator For the case where e is fst (v1v2)or snd (v1v2) then the resulting statement is s with thee replaced by v1 or v2 For the case where e is a functionapplication f v then the reduction is to let x τ [vy] =

s prime[vy] in s where s prime is the body of the function f y is thefunction argument and τ is the return type of the functionWe substitute the value the function is applied to into boththe return type and the body of the function s prime This means

7

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 8: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

771

772

773

774

775

776

777

778

779

780

781

782

783

784

785

786

787

788

789

790

791

792

793

794

795

796

797

798

799

800

801

802

803

804

805

806

807

808

809

810

811

812

813

814

815

816

817

818

819

820

821

822

823

824

825

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

826

827

828

829

830

831

832

833

834

835

836

837

838

839

840

841

842

843

844

845

846

847

848

849

850

851

852

853

854

855

856

857

858

859

860

861

862

863

864

865

866

867

868

869

870

871

872

873

874

875

876

877

878

879

880

that any types in s prime (in var statements) will also undergosubstitution This step is conditional on f having an entryin the function context Φ

For the statement let x τ = s1 in s2 we have two rules ifs1 is a valuev then the reduction is to (δ s2[vx]) otherwisethe reduction is to (δ prime let x τ = s prime1 in s2) where (δ prime s prime1) isthe reduction of (δ s1)

For the statement if v then s1 else s2 we have a case forwhen v is T and one for when v is F For the statementwhile (s1) do s2 we unroll this to

let x z bool |T = s1 inif x then s2while (s1) do s2 else ()

For var u τ = v in s we extend the variable store with thepair (uv ) and the reduced statement is s For assignmentu = v we update the value of u in the store with v Notethat at this stage the value being assigned to the mutablevariable will have no immutable variables

For the sequence statement s1 s2 we have two rules If s1is the unit value then we reduce to s2 If s1 is not the unitvalue then we reduce to (δ prime s prime1 s2) where (δ

prime s prime1) is the resultof the reduction of (δ s1)

For the match statement

match v C1 x1 rArr s1 Cn xn rArr sn

statements where v is of the form C v prime and C matches oneof the Ci then the reduction is to (δ si [v primexi ])

4 Mechanisation in IsabelleIsabelle has been used previously to formalise a number ofprogramming languages For example Lightweight Java [19]andWebAssembly [24] Languages have also been formalisedusing other provers For example C in the Coq theoremprover [11] and CakeML using HOL4 [10]

In this section we describe how the grammar type systemand operational semantics of MiniSail are mechanised inIsabelle

41 General ApproachA key motivation for our mechanisation was our experiencewith proving soundness for MiniSail on paper We beganwith a (paper) proof outline following the usual path forlanguage soundness proofs but incorporating additional lem-mas as needed by novel aspects of the type system Howeverwe quickly realised that there was a lot of detail that neededto be tracked in the paper proof particularly around vari-ables and the structure of the various inductive cases Themechanisation was then begun with the intention that thiswould replace the paper proofs

At the high-level the main argument closely follows thestructure of the paper formalisation In addition to workingon the main argument there is also significant work provid-ing the necessary scaffolding in the form of technical lemmasthat donrsquot appear in paper proofs An example of one of these

lemmas is the weakening lemma for the lookup functionwitha well-formed Γ-context Spending time writing the technicallemmas is important for a number of reasons First we avoidhaving to re-prove similar facts in the proofs of the largerlemmas Second they act as a check of the definitions of datastructures and functions if we fail to prove a trivial technicallemma about a definition or the proof is more complex thanexpect then it can indicate a problem with the definitionand that the definition needs fixing or optimising Finallythese technical lemmas are available for use by Isabellersquosautomatic proof methods and sledgehammerAt the proof-level there was a lot of technical detail in-

volved in each proof Proving facts about freshness and ba-sic manipulation of contexts occurred frequently and thechoices around supporting data structures and functionshad to be made carefully to avoid making the proofs overlycomplex For example the Φ and Θ contexts given abovewere originally one context Π It was realised that when weadded sort checking for expressions we wanted a Π that hadno functions but could contain datatype definitions Thisrequired some messy filtering and associated lemmas andit was realised two separate contexts would be easier andmake things clearerMany theorem provers follow a backward reasoning pat-

tern - you start with the goal and the game is to pick a tacticthat reduces the current goal to a new set of goals and toiterate these steps until no goals are left All that is capturedin the text of the theory is the sequence of invocations ofthe tactics used - the proof scriptProofs using Isar are written closer to how paper mathe-

matics looks and will include not only the proof methods-tactics used but also the goals and subgoals explicitly writtenout in a hierarchical way This allowed us to use the highlevel outline of the proof from our paper formalisation (ora paper sketch if the paper formalism doesnrsquot include thelemma we are working on) as the starting point for manyproofs From the outline we then iteratively filled in the de-tail and when we got to a goal that we thought was simpleenough to be proved by Isabellersquos automatic proof methodsor by sledgehammer we invoked these to see if the goalcould be provedSledgehammer [14] is a tool in Isabelle that is able to

use external provers to provide a proof for goal Isabelle willthen reconstruct this proofmaking use of only proofmethodsavailable within Isabelle itself The sledgehammer mecha-nism is able to choose a set of facts from those available inthe current proof context to use in addition to those explicitlyspecified by the user However sledgehammer didnrsquot alwayscome up with a proof but in these cases it was clear whatlemmas needed to be used but not how to wire them up Of-ten the problem was that there was an error in the phrasingof the goal and a useful improvement to the automatic proofmethods and sledgehammer would be for Isabelle to report

8

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 9: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

881

882

883

884

885

886

887

888

889

890

891

892

893

894

895

896

897

898

899

900

901

902

903

904

905

906

907

908

909

910

911

912

913

914

915

916

917

918

919

920

921

922

923

924

925

926

927

928

929

930

931

932

933

934

935

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

936

937

938

939

940

941

942

943

944

945

946

947

948

949

950

951

952

953

954

955

956

957

958

959

960

961

962

963

964

965

966

967

968

969

970

971

972

973

974

975

976

977

978

979

980

981

982

983

984

985

986

987

988

989

990

to the user the reasons or possible reasons why the methodor sledgehammer failedThe presentation style of Isar whilst requiring each step

and goal to be explicitly spelled out1 does have some ad-vantages when it comes to handling changes to proofs Forexample when we introduced sort checking to the mecha-nisation it was clear from the Isabelle user interface whatproofs needed fixing With a proof-script style of reasoningthis information is hidden inside the reasoning engine andonly by attempting the reasoning steps would you see whatwas wrong from information shown in the user interface

Amajormilestonewas of course having everything provedAt this stage we are on stable platform and we can beginthe process of incrementally tidying up and optimising theproofs For the current version of the mechanisation moreoptimisation can be done in conjunction with building up abetter collection of technical lemmas

42 SyntaxAs with many formalisation of programming languages sub-stitution is an important operation In the operational se-mantics we use it to drive the reduction of let and matchstatements as well as function application Since MiniSailcontains binding statements we need to be careful abouthow substitution operates on these statements The nota-tion we use for substitution of a variable x for a value vin the statement s is s[x = v] Where the statement isa binding statement then there are a number of precondi-tions associated with the substitution In the substitution(letw = e in s )[x = v] we want to ensure that no freevariable inv is captured by the binding and if x = w then wewant the substitution to occur after renamingw in the state-ment to some other variable In paper mathematics theserequirements are handled by the convention that we are freeto replace a term by one that is α-equivalent to it obtained byrenaming bound variables with a new name In mechanicalformalisations there has to be something in the formalisationto handle thisThere are a variety of approaches to this including De

Bruijn indices [5] and the locally nameless representation [3]The approach that we adopted is to make use of Nominallogic ideas [16] as realised in the Isabelle Nominal2 package[8 20] The use of Nominal is not usually the first choice andpartially justified in that we want to write the formal proofsin a way that matches the paper proofs We also wanted totake this opportunity to see how Nominal-Isabelle works ina language formalisationThe following is an outline of Nominal Isabelle as it per-

tains to this formalisation The basic building block in Nom-inal Isabelle is the multi-sorted atom which correspond to

1This effort can be reduced a little by using experimental tools such aslsquoexplorersquo introduced on the Isabelle mailing list httpswwwmail-archivecomisabelle-devmailbroyinformatiktu-muenchendemsg04443html

variables Atoms can be declared as part of the structure ofterms and in MiniSail we have one sort of atoms for mutablevariables and one for immutable variables A basic operationon terms is the operation of permuting atoms For examplein the syntax of Nominal-Isabelle we have

(z harr w ) bull (let x = 1 in z) = (let x = 1 inw )

where (z harr w ) is the permutation that swaps the two atomsz andw and bull is the permutation application operator Fromthis is defined the support for a term which is equivalentin our case to the notion of free variables and from this thedefinition of freshness written as lsquoatom x vrsquo is defined asnon-membership of the support of the term v Nominal-Isabelle uses these concepts to automatically

generate from provided binding information definitionsof α-equivalence for each sort of term For example thestatements let x1 = e1 in s1 and let x2 = e2 in s2 are α-equivalent if e1 = e2 and for any c (x1 s1x2 s2) we have(c harr x1) bull s1 = (c harr x2) bull s2We represent the MiniSail grammar AST as a set of nom-

inal datatypes These are like conventional datatypes butallow us to specify the binding structure of terms and willalso trigger Isabelle to generate the freshness support and α-equivalence facts as described above Isabelle will constructan internal representation for the datatype and quotient itusing α-equivalence to give the surface datatype

nominal_datatype case_s = mdash StatementsC_final dc xx ss binds x in s

| C_branch dc xx ss case_s binds x in s

and s =

AS_val v

| AS_let xx e ss binds x in s

| AS_let2 xx τ s ss binds x in s

| AS_if v s s

| AS_var uu τ v ss binds u in s

| AS_assign u v

| AS_case v case_s

| AS_while s s

| AS_seq s s

Figure 6 Statement Datatype

Figure 6 shows the nominal datatype definition for state-ments The ldquobindsrdquo keyword is used to indicate the bindingstructure for the let let2 and var statements The datatypefor the branches of the case statement are also defined hereand we enforce that there is at least one branch

An important property of functions that is used in proofsis equivariance permuting atoms on the result of a functionis equal to applying the function to permuting atoms on thearguments Informally equivariance means that the functionis well-behaved with respect to α-equivalence Equivariancealso extends to propositions and inductive predicates includ-ing typing judgements When defining nominal functions

9

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 10: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

991

992

993

994

995

996

997

998

999

1000

1001

1002

1003

1004

1005

1006

1007

1008

1009

1010

1011

1012

1013

1014

1015

1016

1017

1018

1019

1020

1021

1022

1023

1024

1025

1026

1027

1028

1029

1030

1031

1032

1033

1034

1035

1036

1037

1038

1039

1040

1041

1042

1043

1044

1045

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1046

1047

1048

1049

1050

1051

1052

1053

1054

1055

1056

1057

1058

1059

1060

1061

1062

1063

1064

1065

1066

1067

1068

1069

1070

1071

1072

1073

1074

1075

1076

1077

1078

1079

1080

1081

1082

1083

1084

1085

1086

1087

1088

1089

1090

1091

1092

1093

1094

1095

1096

1097

1098

1099

1100

Isabelle generates a proof obligation for equivariance thatwe need to prove In most cases these are easy to prove

The first set of functions that we define are the impor-tant substitution functions for the language terms Nomi-nal Isabelle allows us to encode the capture avoiding prop-erty substitution that we need by allowing the specifica-tion of a freshness condition for a branch of the function asshown in Figure 7 We prove a number of lemmas related

nominal_function subst_t x rArr v rArr τ rArr τ whereatom z ♯ (xv) =rArr subst_t x v | z b | c | = | z

b | c[x=v]c |

Figure 7 Substitution Function For Types

permutation of variables to substitution using the lemmasfrom [15] as a guide An example is that substituting in avariable is the same as permutation provided y is fresh in t (x harr y) bull t = t[x = y] As program variables can appearin types we define substitution for types When we definesubstitution for statements we include in the clause for thevar statement substitution into the type annotation Thestructures for contexts Φ Π Γ and ∆ are defined using listor list-like datatypes We define substitution for ∆ which isjust a mapping of the substitution function over the types forthe mutable variables Substitution in Γ Γ[xv] may resultin the lsquocollapsersquo of Γ if x has an entry in Γ Finally for thesyntax part we define the inductive predicates capturing thewell-scoping and sort checking rules

43 SMT ModelAs mentioned above the subtyping check boils down toa validity check in a logic supported by the SMT solverZ3 In order to prove lemmas about subtyping for exampletransitivity substitution and weakening we need to build amodel of the SMT logic and setup definitions and supportinglemmas Themodel also allows us to prove specific subtypingrelations between types used by the progress lemmaThe logic is multi-sorted and we can reuse the base type

datatype to define the sorts for variables and terms in thislogic We first define a datatype representing the values inthis logic and define the type of valuation that is a mapof variables to values To ensure that the valuation mapsvariables to the value of the appropriate sort we define anotion of well-formedness for valuations with respect tothe contexts Θ and Γ This well-formedness condition alsoensures that all the variables in Γ are given a value by thevaluation We then define a series of evaluation relations forliteral values expressions and constraints that define howthese terms are evaluated by a valuation Importantly weprove an existence lemma that says that there is a value fora term under a valuation if the term is well-sorted and thevaluation is well-formed We will then prove later that typesynthesis for values and expressions produce well-formed

constraints and so have shown that synthesised types fallwithin the decidable logic fragment

Building on evaluation of constraints we define satisfac-tion of a constraint in Θ and Γ contexts and then validity Weprove strengthening weakening and substitution propertiesfor validity

44 TypingIn this subsectionwe describe how the typing rules presentedin Figure 5 are translated into inductive predicates in IsabelleFor the most part the Isabelle rules are a transcription

from the paper formalisation into Isabelle The only differ-ence is that we need to add freshness conditions to thepremises of some of the rules For example for the typesynthesis rules for values we need to ensure that z appear-ing in a synthesised type does not also appear in v or thecontext ΓEquivariance for the predicate is proved automatically

and induction rules are generated For nominal predicatesand datatypes we also have strong induction rules gener-ated that we can use in nominal inductive proofs to ensurethat freshness conditions are automatically built into theproof We also instruct Isabelle to generate various forms ofelimination rules that we can use in proofs

For proving lemmas about the typing judgements we canuse induction over the syntax of the term sort that we areproving the lemma for or induction over the structure ofthe derivation making use of the induction rules generatedfrom the typing predicate The latter technique is preferablehowever for statements it initially took some care to con-struct the lemma correctly and to invoke the required proofmethodWe prove lemmas that assure us that synthesised types

are well-formed and are unique up to alpha-equivalence Fur-thermore we prove weakening This tells us that a judgementcontinues to hold if a context is replaced with a larger one- when considered as a set the larger is a superset of thesmaller and that the larger is also well-scoped We do thisfor the Γ and ∆ contexts

45 Substitution LemmasThe key enabling lemmas are the substitution lemmas pri-marily the substitution lemma for statements which is shownin Figure 8 The substitution lemma ensures that reductionsteps using substitution preserve the type over the reduc-tion Before being able to prove this lemma for statementswe need to prove it for expressions and values and sincesubstitution occurs into types as well we need to prove lem-mas involving substitution for types subtyping and validityPrior to this we need to prove a set of narrowing lemmasThese are lemmas telling us that if a variable x has an en-try x b[ϕ] in a context Γ and we replace ϕ with a morerestrictive constraint in Γ then type checking and synthesis

10

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 11: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

1101

1102

1103

1104

1105

1106

1107

1108

1109

1110

1111

1112

1113

1114

1115

1116

1117

1118

1119

1120

1121

1122

1123

1124

1125

1126

1127

1128

1129

1130

1131

1132

1133

1134

1135

1136

1137

1138

1139

1140

1141

1142

1143

1144

1145

1146

1147

1148

1149

1150

1151

1152

1153

1154

1155

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1156

1157

1158

1159

1160

1161

1162

1163

1164

1165

1166

1167

1168

1169

1170

1171

1172

1173

1174

1175

1176

1177

1178

1179

1180

1181

1182

1183

1184

1185

1186

1187

1188

1189

1190

1191

1192

1193

1194

1195

1196

1197

1198

1199

1200

1201

1202

1203

1204

1205

1206

1207

1208

1209

1210

lemma subst_infer_check_s

fixes vv and ss and cscase_s and xx and cc and bb and Γ1Γ and Γ2Γassumes Θ Γ1 ⊢ v rArr τ and Θ Γ1 ⊢ τ ≲ | z b | c | and atom z ♯ (x v)

shows Φ ∆ Γ Θ ⊢ s lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ ⊢ s[x=v]s lArr τrsquo[x=v]τ andΦ ∆ Γ Θ tid ⊢ cs lArr τrsquo =rArr Γ = (Γ2((xbc[z=V_var x]c)Γ1)) =rArr

Φ ∆[x=v]∆ Γ[x=v]Γ Θ tid ⊢ cs[x=v]s lArr τrsquo[x=v]τ

Figure 8 Substitution Lemma for Statements

judgements for statements expressions and values will holdif they hold for the original ΓTo prove the substitution lemma for statements we use

rule induction and outline now how this proceeds for thelet statement The typing rule for the let statement is shownin Figure 9 Rule induction will invert the statement of thelemma and the premises for the rule will be added as premisesfor the case with the appropriate instantiation of variablesOur task is to apply the typing rule for the let statements inthe forward direction to show substitution is true for the letstatement To get to the point where we can do this thereare some steps we need to dobull Apply the substitution lemma for expressions to e toobtain a subtype of the type for ebull Use the induction hypothesis to get the substitutionfor s bull Apply the narrowing lemma for the type of e in thecontextbull Prove the required freshness conditions

The first three steps are shared with a paper proof howeverthe last is notA common proof pattern we encountered was that in

order to satisfy a freshness condition we needed to obtaina new variable that is fresh for a set of terms and contextsObtaining the new variable is a one-liner but the tricky stepis proving facts that hold for terms containing the originalvariable continue to hold in some way for terms with thefresh variable For example in the weakening lemma forthe let statement let x = e in s the typing rule for thisrequires that x does not occur in the context Γ Howeverwith weakening there is no restriction on what might havebeen added to the expanded context and this might havebeen x Therefore we have to prove weakening for an α-equivalent let statement where the bound variable does notoccur in the context

46 Progress Preservation and SafetyWe finish the formalisation with proofs of progress preser-vation and finally safety The statement of these as Isabelletext is shown in Figure 10

For type preservation we donrsquot just prove preservation ofthe type of the statement under reduction but we also needto ensure that the types of the values in the possibly updated

mutable storeδ remain compatible with the∆ context To thisend we introduce a new typing judgement ΦΘ ⊢ (δ s ) lArrτ ∆ This judgement holds if we have ΦΘ ϵ ∆ ⊢ s lArr τand for every (uv ) in δ if there is a pair (uτ ) isin ∆ thenΘ ϵ ⊢ v lArr τ

We extend the preservation lemma to handle multi-step re-ductions (indicated by minusrarrlowast) The progress lemma states thatif a statement and store are well-typed then the statementis either a value or a reduction step can be made Finally weprove the type safety lemma for MiniSail that says that if s iswell-typed and we can multi-step reduce s to s prime then eithers prime is a value or we can make another step

These final lemmas here are starting to look like the paperproofs as we have by this stage established a foundation withthe substitution lemmas and have left the realm of syntaxand typing

5 Experience and DiscussionThe development of the MiniSail has fed back into the de-velopment of the full Sail language Early versions of Sailhad an ad-hoc hand-written constraint solver which wasoften incapable of proving all constraints found in our ISAspecifications By co-developing the MiniSail formalisationwith re-development of the full Sail type checker we wereable to guarantee the decidability of all constraints while si-multaneously increasing the expressiveness of the constraintlanguageNot only has the MiniSail formalisation significantly in-

creased our confidence in the robustness of our type systemdesign the use of an off-the-shelf constraint solver (Z3) com-bined with the bi-directional type-checking approach used inMiniSail has increased the performance of the Sail languagesignificantly before starting our MiniSail formalisation afragment of ARMv8 specified in Sail would take several min-utes to type-check due to pathological cases in the ad-hocconstraint solver In recent versions type-checking the entire64-bit fragment of the ARMv8 instruction set can be done inunder 3 seconds on a modern computer (Intel i7-8700)

We have both a paper formalisation of MiniSail presentedin [2] and a mechanised formalisation presented in this paperand so there is an opportunity to compare the two The firstpoint of comparison is in the lengths of the two works Thepaper formalisation take 85 pages of typeset mathematics

11

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 12: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

1211

1212

1213

1214

1215

1216

1217

1218

1219

1220

1221

1222

1223

1224

1225

1226

1227

1228

1229

1230

1231

1232

1233

1234

1235

1236

1237

1238

1239

1240

1241

1242

1243

1244

1245

1246

1247

1248

1249

1250

1251

1252

1253

1254

1255

1256

1257

1258

1259

1260

1261

1262

1263

1264

1265

PLrsquo18 January 01ndash03 2018 New York NY USA Mark Wassell Alasdair Armstrong Neel Krishnaswami Peter Sewell

1266

1267

1268

1269

1270

1271

1272

1273

1274

1275

1276

1277

1278

1279

1280

1281

1282

1283

1284

1285

1286

1287

1288

1289

1290

1291

1292

1293

1294

1295

1296

1297

1298

1299

1300

1301

1302

1303

1304

1305

1306

1307

1308

1309

1310

1311

1312

1313

1314

1315

1316

1317

1318

1319

1320

check_letI [[atom x ♯ (zcτΓ) atom z ♯ Γ

Φ ∆ Γ Θ ⊢ e rArr | z b | c |

Φ ∆ ((xbc[z=V_var x]c)Γ) Θ ⊢ s lArr τ ]] =rArrΦ ∆ Γ Θ ⊢ (AS_let x e s) lArr τ

Figure 9 Typing rule for Let statement

lemma progress

fixes ss

assumes Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆

shows (exist v s = AS_val v) or (exist δrsquo srsquo Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩)

lemma preservation

fixes ss and srsquos and cs case_s

assumes Φ ⊢ ⟨ δ s ⟩ minusrarr ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows exist∆rsquo Φ Θ ⊢ ⟨ δrsquo srsquo ⟩ lArr τ ∆rsquo and set ∆ sube set ∆rsquo

lemma safety

assumes Φ ⊢ ⟨ δ s ⟩ minusrarrlowast ⟨ δrsquo srsquo ⟩ and Φ Θ ⊢ ⟨ δ s ⟩ lArr τ ∆shows (exist v srsquo = AS_val v) or (exist δrsquorsquo srsquorsquo Φ ⊢ ⟨ δrsquo srsquo ⟩ minusrarr ⟨ δrsquorsquo srsquorsquo ⟩)

Figure 10 Preservation Progress and Safety Lemmas

with associated commentary and the Isabelle mechanisationtakes 402 pages of typeset Isabelle code The Isabelle formal-isation also took around 4 times as long to complete as thepaper proofs This included the initial mechanisation thatwas using the paper proofs as a starting point and the extratime to add the sort-checking that isnrsquot in the paper proofs

To decrease the effort better tools and practices are neededto remove some of the boilerplate proof code and the repet-itive nature of the proofs Ott [18] was used to typeset thetyping rules and operational semantics for the paper proofsOtt can be used to specifying binding structure and doesexport to Isabelle (but not in the Nominal style) So futurework is to enhance Ott and to see if can be used to tame theboilerplate similar to the work described in [9]A feature of paper proofs is that the author has greater

control over the structure of the proof and can ensure that themain thread of the argument is made clear and that ancillarydetail is elided With a mechanised proof the detail has to bepresent somewhere The challenge in this mechanisation hasbeen to ensure that the detail particularly the detail aboutfreshness which wouldnrsquot appear in the paper proof doesnrsquotswamp or hide the main argument Isabelle provides a meansto structure proofs and this has been taken advantage of inmost places in the mechanisation However some more workis required to clean up some of the proofs possibly with thehelp of suitable technical lemmas that will enable Isabelleautomatic proof methodsAnother point of comparison between paper and mecha-

nised proofs is around the level of abstraction In the paperproofs we deal directly with the language syntax whereasin the mechanisation we have to go through datatypes thatrepresent the syntax When constructing the datatypes we

can choose to have a datatype constructor corresponds toa single production rule in the syntax or we can abstractand have the datatype represent a number of similar rulesand have some parameterisation mechanism For examplein the Isabelle mechanisation for the binary operators + andle we have a single constructor for the binary operator andparameterise over the operator

In the definition of the typing predicate we give one rulefor + and one for le The impact of this choice is how thecases for the inductive proofs look when we do inductionover the AST or the rules If we go for a match between ASTand grammar such as with the fst and snd operators thenthere will be a overlap in the proof text for the cases for theseoperators The proofs follow the same overall pattern withdifference being in how the value and type are projected outIn a paper proofs the author can present the proof for justone case and appeal to proof by analogy for the other in themechanisation the analogy has to be made explicitThere are number of avenues to explore making use of

this work and the experience The first is to investigate if themechanisation can be used to generate an implementation ofa type checking and interpreter for MiniSail Code exportingis a well used feature of Isabelle however the challenge hereis that code exporting for Nominal-Isabelle has not beensolved Second formalise a larger subset of Sail either withinthe Nominal framework or use another representation Inthe case of the latter prove some sort of correspondence

AcknowledgmentsThisworkwas partially supported by EPSRC grant EPK0085281(REMS) We also thank Thomas Bauereiss for his suggestionson the Isabelle code

12

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References
Page 13: Mechanised Metatheory for the Sail ISA Specification Languagenk480/sail-isabelle.pdf · PL’18, January 01–03, 2018, New York, NY, USA 2018. comprehensive enough to run the model

1321

1322

1323

1324

1325

1326

1327

1328

1329

1330

1331

1332

1333

1334

1335

1336

1337

1338

1339

1340

1341

1342

1343

1344

1345

1346

1347

1348

1349

1350

1351

1352

1353

1354

1355

1356

1357

1358

1359

1360

1361

1362

1363

1364

1365

1366

1367

1368

1369

1370

1371

1372

1373

1374

1375

Mechanised Metatheory for the Sail ISA Specification Language PLrsquo18 January 01ndash03 2018 New York NY USA

1376

1377

1378

1379

1380

1381

1382

1383

1384

1385

1386

1387

1388

1389

1390

1391

1392

1393

1394

1395

1396

1397

1398

1399

1400

1401

1402

1403

1404

1405

1406

1407

1408

1409

1410

1411

1412

1413

1414

1415

1416

1417

1418

1419

1420

1421

1422

1423

1424

1425

1426

1427

1428

1429

1430

References[1] ARM 2015 ARM Architecture Reference Manual (ARMv8 for ARMv8-A

architecture profile) ARM ARM DDI 0487Ah (ID092915)[2] Alasdair Armstrong Thomas Bauereiss Brian Campbell Alastair

Reid Kathryn E Gray Robert M Norton Prashanth Mundkur MarkWassell Jon French Christopher Pulte Shaked Flur Ian Stark NeelKrishnaswami and Peter Sewell 2019 ISA Semantics for ARMv8-A RISC-V and CHERI-MIPS Proc ACM Program Lang httpwwwclcamacuk~pes20sail

[3] Arthur Chargueacuteraud 2011 The Locally Nameless RepresentationJournal of Automated Reasoning 49 (2011) 363ndash408 httpsdoiorg101007s10817-011-9225-2

[4] Stefan Berghofer Christian Urban and Cezary Kaliszyk 2013 Nominal2 Archive of Formal Proofs (Feb 2013) httpswwwisa-afporgentriesNominal2html Nominal 2

[5] NG de Bruijn 1972 Lambda calculus notation with nameless dummiesa tool for automatic formula manipulation with application to theChurch-Rosser theorem Indagationes Mathematicae (Proceedings) 755 (1972) 381 ndash 392 httpsdoiorg1010161385-7258(72)90034-0

[6] Leonardo De Moura and Nikolaj Bjoslashrner 2008 Z3 An Efficient SMTSolver In Proceedings of the Theory and Practice of Software 14th Inter-national Conference on Tools and Algorithms for the Construction andAnalysis of Systems (TACASrsquo08ETAPSrsquo08) Springer-Verlag Berlin Hei-delberg 337ndash340 httpdlacmorgcitationcfmid=17927341792766

[7] Joshua Dunfield and Neelakantan R Krishnaswami 2013 Completeand Easy Bidirectional Typechecking for Higher-Rank PolymorphismIn International Conference on Functional Programming (ICFP) arXiv13066032[csPL]

[8] Brian Huffman and Christian Urban 2010 A New Foundation forNominal Isabelle In Proceedings of the First International Conferenceon Interactive Theorem Proving (ITPrsquo10) Springer-Verlag Berlin Hei-delberg 35ndash50 httpsdoiorg101007978-3-642-14052-5_5

[9] Steven Keuchel Stephanie Weirich and Tom Schrijvers 2016 Needleamp Knot Binder Boilerplate Tied Up In Proceedings of the 25th EuropeanSymposium on Programming Languages and Systems - Volume 9632Springer-Verlag New York Inc New York NY USA 419ndash445 httpsdoiorg101007978-3-662-49498-1_17

[10] Ramana Kumar Magnus O Myreen Michael Norrish and Scott Owens2014 CakeML A Verified Implementation of ML In Principles ofProgramming Languages (POPL) ACM Press 179ndash191 httpsdoiorg10114525358382535841

[11] Xavier Leroy Sandrine Blazy Daniel Kaumlstner Bernhard SchommerMarkus Pister and Christian Ferdinand 2016 CompCert - A FormallyVerified Optimizing Compiler In ERTS 2016 Embedded Real TimeSoftware and Systems 8th European Congress SEE Toulouse Francehttpshalinriafrhal-01238879

[12] Tobias Nipkow Markus Wenzel and Lawrence C Paulson 2002 Is-abelleHOL A Proof Assistant for Higher-order Logic Springer-VerlagBerlin Heidelberg

[13] Ulf Norell 2009 Dependently Typed Programming in Agda In Pro-ceedings of the 6th International Conference on Advanced FunctionalProgramming (AFPrsquo08) Springer-Verlag Berlin Heidelberg 230ndash266httpdlacmorgcitationcfmid=18133471813352

[14] Lawrence C Paulson 2010 Three Years of Experience with Sledge-hammer a Practical Link between Automatic and Interactive TheoremProvers In Proceedings of the 2nd Workshop on Practical Aspects ofAutomated Reasoning PAAR-2010 Edinburgh Scotland UK July 142010 1ndash10 httpwwweasychairorgpublicationspaper52675

[15] Lawrence C Paulson 2013 Goumldelrsquos Incompleteness TheoremsArchive of Formal Proofs 2013 (2013) httpswwwisa-afporgentriesIncompletenessshtml

[16] Andrew M Pitts 2003 Nominal logic a first order theory of namesand binding Information and Computation 186 2 (2003) 165 ndash 193httpsdoiorg101016S0890-5401(03)00138-X Theoretical Aspects ofComputer Software (TACS 2001)

[17] Patrick M Rondon Ming Kawaguci and Ranjit Jhala 2008 LiquidTypes In Proceedings of the 29th ACM SIGPLAN Conference on Pro-gramming Language Design and Implementation (PLDI rsquo08) ACM NewYork NY USA 159ndash169 httpsdoiorg10114513755811375602

[18] Peter Sewell Francesco Zappa Nardelli Scott Owens Gilles PeskineThomas Ridge Susmit Sarkar and Rok Strniša 2007 Ott EffectiveTool Support for the Working Semanticist In Proceedings of the 12thACM SIGPLAN International Conference on Functional Programming(ICFP rsquo07) ACM New York NY USA 1ndash12 httpsdoiorg10114512911511291155

[19] Rok Strniša and Matthew Parkinson 2011 Lightweight Java Archiveof Formal Proofs (Feb 2011) httpswwwisa-afporgentriesLightweightJavahtml Lightweight Java

[20] Christian Urban and Christine Tasson 2005 Nominal Techniques inIsabelleHOL In Proceedings of the 20th International Conference onAutomated Deduction (CADErsquo 20) Springer-Verlag Berlin Heidelberg38ndash53 httpsdoiorg10100711532231_4

[21] Niki Vazou Eric L Seidel Ranjit Jhala Dimitrios Vytiniotis and SimonPeyton-Jones 2014 Refinement Types for Haskell SIGPLAN Not 499 (Aug 2014) 269ndash282 httpsdoiorg10114526929152628161

[22] Robert N Watson et al 2017 Capability Hardware Enhanced RISC In-structions (CHERI) httpwwwclcamacukresearchsecurityctsrdcheri

[23] Robert N MWatson JonathanWoodruff Peter G Neumann SimonWMoore Jonathan Anderson David Chisnall Nirav H Dave BrooksDavis Khilan Gudka Ben Laurie Steven J Murdoch Robert NortonMichael Roe Stacey D Son and Munraj Vadera 2015 CHERI AHybrid Capability-System Architecture for Scalable Software Com-partmentalization In 2015 IEEE Symposium on Security and Privacy SP2015 San Jose CA USA May 17-21 2015 20ndash37 httpsdoiorg101109SP20159

[24] Conrad Watt 2018 Mechanising and Verifying the WebAssemblySpecification In Proceedings of the 7th ACM SIGPLAN InternationalConference on Certified Programs and Proofs (CPP 2018) ACM NewYork NY USA 53ndash65 httpsdoiorg1011453167082

13

  • Abstract
  • 1 Introduction
  • 2 Sail
  • 3 MiniSail
    • 31 Kernel Language Design
    • 32 Syntax
    • 33 Typing Judgements and Contexts
    • 34 Sort Checking
    • 35 Subtyping and Validity
    • 36 Typing Rules
    • 37 Operational Semantics
      • 4 Mechanisation in Isabelle
        • 41 General Approach
        • 42 Syntax
        • 43 SMT Model
        • 44 Typing
        • 45 Substitution Lemmas
        • 46 Progress Preservation and Safety
          • 5 Experience and Discussion
          • Acknowledgments
          • References