272: Software Engineering Fall 2012

Instructor: Tevfik Bultan

Lecture 3: Modular Verification with Magic, Predicate Abstraction

Modular verification with Magic

• MAGIC: Modular Analysis of proGrams In C

• Goal: Automated verification of C programs against finite state machine specifications (given as labeled transition systems)

– Checks that the behavior of the C program conforms to the behavior of the state machine

• It is a modular verification approach: the decomposition of the verification task follows the modularity in the code

– The procedure that is being analyzed can invoke other procedures which are themselves specified as state machines

• It uses predicate abstraction for automatically generating procedure abstractions and then checks conformance of the extracted procedure abstraction to the specification

• It uses the abstract-verify-refine approach

– If the conformance check fails, the procedure abstraction can be refined

Labeled transition systems as specifications

• A labeled transition system (LTS) M is a 4-tuple (S, S0, Act, T) where

– S is a finite, non-empty set of states

– S0 ⊆ S is the set of initial states

– Act is the set of actions

– T ⊆ S × Act × S is the transition relation

• Assume that there is a special type of state called a STOP state.

– A STOP state has no outgoing transitions

• (s, a, s’) ∈ T is also written as s →a s’

• s ⇒a s’ means that s’ is reachable from s by following exactly one a-transition and an arbitrary number of ε-transitions

– ε is a special action in Act. It corresponds to a silent action (like skip)
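The definition above can be sketched in Python. This is only an illustration, not MAGIC's data structure: the class name `LTS`, the string `"eps"` standing for ε, and the tiny example system `m` are all assumptions made for this sketch. `weak_step` computes the weak transition (ε-transitions, then one a-transition, then ε-transitions).

```python
from dataclasses import dataclass

EPS = "eps"  # stands for the silent action ε (name chosen for this sketch)


@dataclass(frozen=True)
class LTS:
    states: frozenset
    init: frozenset
    actions: frozenset
    trans: frozenset  # set of (s, a, s') triples

    def step(self, s, a):
        """States reachable from s by exactly one a-transition."""
        return {t for (p, b, t) in self.trans if p == s and b == a}

    def eps_closure(self, sources):
        """States reachable from `sources` by zero or more ε-transitions."""
        seen, work = set(sources), list(sources)
        while work:
            s = work.pop()
            for t in self.step(s, EPS):
                if t not in seen:
                    seen.add(t)
                    work.append(t)
        return seen

    def weak_step(self, s, a):
        """States s' such that s weakly reaches s' on action a."""
        after = set()
        for p in self.eps_closure({s}):
            after |= self.step(p, a)
        return self.eps_closure(after)


# Hypothetical example: p0 -ε-> p1 -lock-> p2 -ε-> STOP
m = LTS(
    states=frozenset({"p0", "p1", "p2", "STOP"}),
    init=frozenset({"p0"}),
    actions=frozenset({EPS, "lock"}),
    trans=frozenset({("p0", EPS, "p1"), ("p1", "lock", "p2"), ("p2", EPS, "STOP")}),
)
print(m.weak_step("p0", "lock"))  # both p2 and STOP are weakly reachable
```

Note that `m.step("p0", "lock")` is empty, because p0 has no direct lock-transition; only the weak transition skips over the silent steps.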

Example LTS

• There is a textual language (called Finite State Processes, FSP) for specifying labeled transition systems

• For the above LTS, the FSP specification would be:

MyLock = { lock -> return {$0 == 0} -> STOP

| return {$0 == 1} -> STOP } .

[Figure: LTS with states MyLock and STOP and transitions labeled lock, return[0], and return[1]]

An example LTS and an example procedure

• The goal is to check the conformance between the C procedures and the specification LTSs

[Figure: LTS with states MyLock and STOP and transitions labeled lock, return[0], and return[1]]

int proc() {
    if (do_lock()) return 0;
    else return 1;
}

Procedure Abstractions

• They define a procedure abstraction (PA) as a set of LTSs.

• A PA is a tuple <d, l> where

– d is the declaration for the procedure (as it appears in a C header file)

– l is a finite list <g1, M1>, …, <gn, Mn> where each gi is a guard formula ranging over the parameters of the procedure and each Mi is an LTS with a single initial state

• The guards are mutually exclusive

• A PA is an abstraction of a procedure, if, for all i between 1 and n, when the guard gi evaluates to true over the actual parameters passed to the procedure, the procedure conforms to the LTS Mi
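A minimal sketch of how a PA <d, l> could be represented, assuming guards are Python predicates over the actual parameters. The declaration `int f(int n);`, the guards, and the placeholder LTS names `M_pos` and `M_nonpos` are hypothetical and chosen only for illustration.

```python
def select_lts(pa, args):
    """Return the LTS whose guard holds for the actual arguments, if any."""
    declaration, guarded = pa
    matches = [lts for guard, lts in guarded if guard(*args)]
    if len(matches) > 1:
        # The guards of a PA are required to be mutually exclusive.
        raise ValueError("guards are not mutually exclusive for " + repr(args))
    return matches[0] if matches else None


# Hypothetical PA: the LTSs are represented by placeholder names here.
pa = (
    "int f(int n);",
    [
        (lambda n: n > 0, "M_pos"),
        (lambda n: n <= 0, "M_nonpos"),
    ],
)

print(select_lts(pa, (3,)))  # M_pos
print(select_lts(pa, (0,)))  # M_nonpos
```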

Procedure Abstractions

• Procedure abstractions serve two purposes

1. They are used to specify desired behavior of the procedures

• They present techniques to automatically extract a PA from a given procedure

2. They are used to achieve modular verification

• During verification of a procedure, the behaviors of procedures that are called by that procedure are abstracted as PAs

Conformance as Weak Simulation

• Once a PA is extracted from a given procedure, then we want to check if the extracted PA conforms to the given LTS specification

• In order to do this we need to formalize what it means to “conform” to a given LTS specification

• They do this by using weak simulation

• Weak simulation preserves LTLX properties

– LTLX is the temporal logic LTL without the next state operator X

– So,

1. if we verify an LTLX property on the specification LTS, and

2. show that the procedure conforms to the specification LTS, then

3. we can conclude that the procedure also satisfies the LTLX property

Conformance as Weak Simulation

• Given two LTSs M = (S, S0, Act, T) and M’ = (S’, S0’, Act, T’)

• M’ weakly simulates M if and only if there exists a weak simulation relation E ⊆ S × S’ such that

1. For all s ∈ S0 there exists an s’ ∈ S0’ such that (s, s’) ∈ E

2. (s, s’) ∈ E implies that for all actions a ∈ Act \ {ε}, if s ⇒a s1 then there exists an s1’ ∈ S’ such that s’ ⇒a s1’ and (s1, s1’) ∈ E

(Here ⇒a is the weak transition: one a-transition plus any number of ε-transitions.)
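The two conditions can be checked with a naive fixpoint computation: start from all pairs of states and delete pairs that violate condition 2 until nothing changes. The sketch below is one straightforward way to do this, not the SAT-based reduction used by MAGIC. The function names and the two example implementations (`good_impl`, `bad_impl`) are assumptions made for illustration; the specification follows the MyLock FSP text.

```python
EPS = "eps"  # stands for the silent action ε


def eps_closure(trans, sources):
    seen, work = set(sources), list(sources)
    while work:
        s = work.pop()
        for (p, a, t) in trans:
            if p == s and a == EPS and t not in seen:
                seen.add(t)
                work.append(t)
    return seen


def weak_step(trans, s, a):
    before = eps_closure(trans, {s})
    after = {t for (p, b, t) in trans if b == a and p in before}
    return eps_closure(trans, after)


def weakly_simulates(spec, impl):
    """True if LTS `spec` weakly simulates LTS `impl`.
    Each LTS is a tuple (states, initial_states, actions, transitions)."""
    S, S0, act, T = impl
    S2, S02, act2, T2 = spec
    visible = (act | act2) - {EPS}
    E = {(s, s2) for s in S for s2 in S2}
    changed = True
    while changed:  # remove pairs violating condition 2 until stable
        changed = False
        for (s, s2) in list(E):
            ok = all(
                any((s1, t2) in E for t2 in weak_step(T2, s2, a))
                for a in visible
                for s1 in weak_step(T, s, a)
            )
            if not ok:
                E.discard((s, s2))
                changed = True
    # Condition 1: every initial state of impl is matched by an initial state of spec
    return all(any((s, s2) in E for s2 in S02) for s in S0)


ACTS = {"lock", "return[0]", "return[1]", EPS}
spec = ({"MyLock", "L", "STOP"}, {"MyLock"}, ACTS,
        {("MyLock", "lock", "L"), ("L", "return[0]", "STOP"),
         ("MyLock", "return[1]", "STOP")})
# Hypothetical abstraction of a conforming procedure
good_impl = ({"i0", "i1", "i2", "iS"}, {"i0"}, ACTS,
             {("i0", "lock", "i1"), ("i1", "return[0]", "iS"),
              ("i0", EPS, "i2"), ("i2", "return[1]", "iS")})
# Hypothetical buggy procedure: returns 0 without performing lock
bad_impl = ({"i0", "iS"}, {"i0"}, ACTS, {("i0", "return[0]", "iS")})

print(weakly_simulates(spec, good_impl))  # True
print(weakly_simulates(spec, bad_impl))   # False
```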

Weak Simulation

• The existence of a simulation relation between two labeled transition systems can be checked by reducing the problem to an instance of Boolean satisfiability

• Due to the specific structure of the SAT instances produced in this reduction, their satisfiability can be decided in linear time.

• Weak simulation is the conformance criterion that is used in Magic:

– A procedure conforms to an LTS if the LTS can weakly simulate the procedure

– This means that the implementation (the C procedure) is safely abstracted by its specification (the LTS)

Overall Approach

Given a specification Mspec for a procedure

• First, extract Mimp which abstracts the behavior of the procedure

– During the abstraction process, the procedures that are called by the procedure being analyzed are modeled using a set of given procedure abstractions (which are called assumption PAs)

– The procedure abstraction is automatically generated using the given assumption PAs and predicate abstraction

• Then, check if Mimp conforms to Mspec (via weak simulation)

– If Mimp conforms to Mspec then verification is successful and we are done

– If Mimp does not conform to Mspec then we check the cause for non-conformance

• If it is a bug in the implementation, then we found an error and we are done

• If it is not a bug, but non-conformance is due to imprecision in the abstraction Mimp, then refine Mimp and repeat the process
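The control flow of this approach can be sketched as a generic loop. Everything here is an assumption for illustration: the function name, the four callbacks, and the toy stand-ins used to run it are hypothetical and are not part of MAGIC.

```python
def abstract_verify_refine(extract, check, is_real_bug, refine, predicates, max_rounds=10):
    """Generic abstract-verify-refine loop.

    extract(predicates)     -> M_imp, an abstraction of the procedure
    check(m_imp)            -> None if M_imp conforms to M_spec, else a counterexample
    is_real_bug(cex)        -> True if the counterexample is a real implementation bug
    refine(predicates, cex) -> a larger predicate set that rules out the counterexample
    """
    for _ in range(max_rounds):
        m_imp = extract(predicates)
        cex = check(m_imp)
        if cex is None:
            return "verified", predicates
        if is_real_bug(cex):
            return "bug", cex
        predicates = refine(predicates, cex)
    return "unknown", predicates


# Toy stand-ins so the loop can run: the "abstraction" is just the predicate
# set, and it conforms once it contains the predicate "x == y".
result = abstract_verify_refine(
    extract=lambda preds: frozenset(preds),
    check=lambda m: None if "x == y" in m else "spurious trace",
    is_real_bug=lambda cex: False,
    refine=lambda preds, cex: preds | {"x == y"},
    predicates=frozenset(),
)
print(result[0])  # verified
```

The `max_rounds` bound is a design choice of this sketch: the refinement loop is not guaranteed to terminate in general, so the sketch gives up with "unknown" rather than looping forever.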

Model Extraction

Extraction of Mimp relies on the following principles:

• Every state of Mimp models a state during execution of the procedure, so every state is composed of a control component and a data component

• The control components intuitively represent the values of the program counter and are formally obtained from the CFG

• The data components are abstract representations of the memory state of the procedure and are obtained using predicate abstraction

• The transitions between states of the Mimp are derived from the transitions in the control flow graph taking into account the assumption PAs and the predicate abstraction

Inlining assumption PAs

• During the model extraction, assumption PAs are used to handle procedure calls

• If the procedure that is being abstracted calls another procedure p, then the PA for p is inlined by

– creating a copy of the LTS for p

– inserting an ε-transition from the call location to the initial state of the LTS for p

– inserting ε-transitions from the STOP states of the LTS for p to the statement right after the call statement
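The three inlining steps can be sketched as follows. The function name `inline_pa`, the `tag` used to keep copies apart, and the one-transition callee LTS are assumptions of this sketch; handling of guards and return values is left out.

```python
EPS = "eps"  # stands for the silent action ε


def inline_pa(call_loc, after_call, pa_init, pa_trans, pa_stops, tag):
    """Return the transitions that inline one copy of an assumption LTS.

    States of the copy are renamed to (tag, state), so inlining the same
    PA at two call sites yields two disjoint copies."""
    copy = {((tag, s), a, (tag, t)) for (s, a, t) in pa_trans}
    enter = {(call_loc, EPS, (tag, pa_init))}
    leave = {((tag, s), EPS, after_call) for s in pa_stops}
    return copy | enter | leave


# Hypothetical callee LTS: q0 -lock-> STOP, called at location c1,
# with c2 as the statement right after the call.
inlined = inline_pa("c1", "c2", "q0", {("q0", "lock", "STOP")}, {"STOP"}, tag="call1")
for transition in sorted(inlined, key=repr):
    print(transition)
```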

Experiments with MAGIC

• OpenSSL is an open source implementation of the publicly available SSL specification

– The SSL protocol is used by a client (typically a web browser) and a server to establish a secure socket connection over a malicious network using public and symmetric key cryptography

• A critical component of the protocol is the handshake

• Check if the openssl-0.9.6c implementation of the server-side handshake conforms to its specification

– Implementation is encapsulated in a single procedure with 347 lines of C code

– They wrote the Mspec manually (an LTS with 28 states and 67 transitions)

• Check if the client-side implementation conforms to the specification

– Implementation is encapsulated in a single procedure with 345 lines of C code

– Mspec is an LTS with 28 states and 60 transitions

Experiments with MAGIC

• They provided 18 predicates for abstraction and provided the PAs for 12 library routines

• Server-side verification took 255 seconds and 130MB of memory

• Client-side verification took 226 seconds and 107MB of memory

• They then changed the specification model to see if their approach can catch errors

– Server-side error was found in 247 seconds using 130MB of memory

– Client-side error was found in 227 seconds using 11MB of memory

Predicate Abstraction

• In the following slides I will give an overview of the predicate abstraction technique

Abstraction (A simplified view)

• How do we generate an abstract transition system?

• Merge states in the concrete transition system (based on some criteria)

– This reduces the number of states, so it should be easier to do verification

• Do not eliminate transitions

– This will make sure that the paths in the abstract transition system subsume the paths in the concrete transition system

• For every path in the concrete transition system, there is a corresponding path in the abstract transition system

– If no path in the abstract transition system violates a property, then no path in the concrete system can violate the property

• Using this reasoning we can verify properties in the abstract transition system

– If the property holds on the abstract transition system, we are sure that the property holds in the concrete transition system

– If the property does not hold in the abstract transition system, then we are not sure if the property holds or not in the concrete transition system
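A small sketch of this reasoning, assuming a toy concrete system made up for illustration (states 0 to 5 with transitions i → i+2, starting at 0) and the property "state 5 is never reached". Merging by parity is precise enough to prove the property; merging by size is not, and reports a violation that does not exist in the concrete system.

```python
def abstract(trans, init, block_of):
    """Merge concrete states into blocks and keep every transition."""
    a_trans = {(block_of(s), block_of(t)) for (s, t) in trans}
    a_init = {block_of(s) for s in init}
    return a_trans, a_init


def reachable(trans, init):
    seen, work = set(init), list(init)
    while work:
        s = work.pop()
        for (p, t) in trans:
            if p == s and t not in seen:
                seen.add(t)
                work.append(t)
    return seen


# Toy concrete system (an assumption of this sketch)
trans = {(i, i + 2) for i in range(4)}  # 0->2, 1->3, 2->4, 3->5
init = {0}


def parity(s):
    return "even" if s % 2 == 0 else "odd"


def size(s):
    return "small" if s < 3 else "large"


# State 5 lies in block "odd", which is unreachable: property proved.
print(reachable(*abstract(trans, init, parity)))
# State 5 lies in block "large", which is reachable: inconclusive.
print(reachable(*abstract(trans, init, size)))
# The concrete system itself never reaches 5.
print(5 in reachable(trans, init))
```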


• If the property does not hold in the abstract transition system, what can we do?

• We can refine the abstract transition system (split some states that we merged)

• We have to make sure that the refined transition system is still an abstraction of the concrete transition system

• Then, we can recheck the property on the refined transition system

– If the property still does not hold, we can refine again


• An automated abstraction technique which can be used to reduce the state space of a program

• The basic idea in predicate abstraction is to remove some variables from the program by just keeping information about a set of predicates about them

• For example, a predicate such as x = y may be the only information necessary about variables x and y to determine the behavior of the program

– In that case we can just store a boolean variable which corresponds to the predicate x = y and remove variables x and y from the program

– Predicate abstraction is a technique for doing such abstractions automatically


• Given a program and a set of predicates, predicate abstraction abstracts the program so that only the information about the given predicates is preserved

• The abstracted program adds nondeterminism since in some cases it may not be possible to figure out what the next value of a predicate will be based on the predicates in the given set

• One needs an automated theorem prover to compute the abstraction

Predicate Abstraction, A Very Simple Example

• Assume that we have two integer variables x, y

• We want to abstract the program using a single predicate “x=y”

• We will divide the states of the program into two sets:

1. The states where “x=y” is true

2. The states where “x=y” is false, i.e., “x≠y”

• We will then merge all the states in the same set

– This is an abstraction

– Basically, we forget everything except the value of the predicate “x=y”

Predicate Abstraction, A Very Simple Example

• We will represent the predicate “x=y” as the boolean variable B in the abstract program

– “B=true” will mean “x=y” and

– “B=false” will mean “x≠y”

• Assume that we want to abstract the following program which contains only one statement:

y := y+1

Predicate Abstraction, Step 1

• Calculate preconditions based on the predicate

{x = y + 1} y := y + 1 {x = y}

– {x = y + 1} is the precondition for B being true after executing the statement y := y+1

– Using our temporal logic notation we can say something like: {x = y + 1} ⇒ AX {x = y}

{x ≠ y + 1} y := y + 1 {x ≠ y}

– {x ≠ y + 1} is the precondition for B being false after executing the statement y := y+1

– Again, using our temporal logic notation: {x ≠ y + 1} ⇒ AX {x ≠ y}

Predicate Abstraction, Step 2

• Use decision procedures to determine if the predicates used for abstraction imply any of the preconditions

x = y ⇒ x = y + 1 ? No

x ≠ y ⇒ x = y + 1 ? No

x = y ⇒ x ≠ y + 1 ? Yes

x ≠ y ⇒ x ≠ y + 1 ? No

Predicate Abstraction, Step 3

• Generate abstract code

Concrete statement: y := y + 1

Abstract statement: IF B THEN B := false ELSE B := true | false
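Steps 1 to 3 can be replayed in a few lines of Python. This is a sketch under a strong assumption: instead of a decision procedure or theorem prover, the implications are checked by brute force over a small bounded range of integers (`DOMAIN`), which is enough for this example but is not a proof in general. All names are hypothetical.

```python
from itertools import product

DOMAIN = range(-5, 6)  # bounded stand-in for the integers (assumption of this sketch)


def pred(x, y):
    return x == y  # the abstraction predicate "x=y"


def stmt(x, y):
    return x, y + 1  # the statement y := y + 1


def implies(pre, post):
    """Brute-force check that pre(x, y) implies post(x, y) on DOMAIN."""
    return all(post(x, y) for x, y in product(DOMAIN, DOMAIN) if pre(x, y))


def abstract_successors(b):
    """Possible values of B after the statement, given B == b before it."""
    def pre(x, y):
        return pred(x, y) == b

    def wp_true(x, y):   # precondition for B being true afterwards: x = y + 1
        return pred(*stmt(x, y))

    def wp_false(x, y):  # precondition for B being false afterwards: x != y + 1
        return not pred(*stmt(x, y))

    if implies(pre, wp_true):
        return {True}
    if implies(pre, wp_false):
        return {False}
    return {True, False}  # neither implication holds: nondeterministic choice


print(abstract_successors(True))   # {False}
print(abstract_successors(False))  # both values are possible
```

The two results correspond to the two branches of the abstract statement: when B is true it becomes false, and when B is false the next value is a nondeterministic choice.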

Predicate abstraction wrt the predicate “x=y”

1) Compute preconditions

{x = y + 1} y := y + 1 {x = y}

{x ≠ y + 1} y := y + 1 {x ≠ y}

2) Check implications

x = y ⇒ x = y + 1 ? No

x ≠ y ⇒ x = y + 1 ? No

x = y ⇒ x ≠ y + 1 ? Yes

x ≠ y ⇒ x ≠ y + 1 ? No

3) Generate abstract code

Checking conformance to a state machine

• We want to check if this procedure conforms to this LTS

void example() {
  do {
A:  KeAcquireSpinLock();
    nPacketsOld = nPackets;
    req = devExt->WLHV;
    if(req && req->status){
      devExt->WLHV = req->Next;
B:    KeReleaseSpinLock();
      irp = req->irp;
      if(req->status > 0){
        irp->IoS.Status = SUCCESS;
        irp->IoS.Info = req->Status;
      } else {
        irp->IoS.Status = FAIL;
        irp->IoS.Info = req->Status;
      }
      SmartDevFreeBlock(req);
      IoCompleteRequest(irp);
      nPackets++;
    }
  } while(nPackets!=nPacketsOld);
C: KeReleaseSpinLock();
}

[Figure: LTS SpinLock with transitions labeled KeAcquireSpinLock(), KeReleaseSpinLock(), and return, ending in STOP]

Converting a C program to a state machine

• We can convert a C program to a state machine

– The control component of the state machine will be the states of the control flow graph

– The data component of the state machine will be the values of the predicates used for predicate abstraction

C Code:

void example() {
  do {
A:  KeAcquireSpinLock();
    nPacketsOld = nPackets;
    req = devExt->WLHV;
    if(req && req->status){
      devExt->WLHV = req->Next;
B:    KeReleaseSpinLock();
      irp = req->irp;
      if(req->status > 0){
        irp->IoS.Status = SUCCESS;
        irp->IoS.Info = req->Status;
      } else {
        irp->IoS.Status = FAIL;
        irp->IoS.Info = req->Status;
      }
      SmartDevFreeBlock(req);
      IoCompleteRequest(irp);
      nPackets++;
    }
  } while(nPackets!=nPacketsOld);
C: KeReleaseSpinLock();
}

State Machine (as a program):

void example()
begin
  do
A:  KeAcquireSpinLock();
    skip;
    if (*) then
      skip;
B:    KeReleaseSpinLock();
      skip;
      if (*) then
        skip;
      else
        skip;
      fi
      skip;
    fi
  while (*);
C: KeReleaseSpinLock();
end

Other than the statements labeled A,B and C, all the rest are ε-transitions

Abstraction Preserves Correctness

• The state machine that is generated with predicate abstraction is non-deterministic (the branches labeled “*” are non-deterministic choices)

– Non-determinism is used to handle the cases where the predicates used during predicate abstraction are not sufficient to determine which branch will be taken

• If we find no error in the generated state machine then we are sure that there are no errors in the original program

– The abstract state machine allows more behaviors than the original program due to non-determinism.

– Hence, if the abstract state machine is correct then the original program is also correct.

Counter-Example Guided Abstraction Refinement (CEGAR)

• However, if we find an error in the abstract state machine this does not mean that the original program is incorrect.

– The erroneous behavior in the abstract state machine could be an infeasible execution path that is caused by the non-determinism introduced during abstraction.

• Counter-example guided abstraction refinement is a technique used to iteratively refine the abstract state machine in order to remove the spurious counter-example traces

CEGAR

The basic idea in counter-example guided abstraction refinement is the following:

• First look for an error in the abstract program (if there are no errors, we can terminate since we know that the original program is correct)

• If there is an error in the abstract program, generate a counter-example path on the abstract program

• Check if the generated counter-example path is feasible using a theorem prover

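• If the generated path is feasible, it is a real error in the original program and we can report it

• If the generated path is infeasible, add the predicate from the branch condition where an infeasible choice is made to the predicate set and generate a new abstract program using predicate abstraction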
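The feasibility check can be illustrated on the spin lock example. The sketch below replaces the theorem prover with brute-force execution over a small set of initial states, and simplifies the branch condition `req && req->status` to a single flag `req`; both are assumptions of this sketch, as are all the names. The first path skips the if-body and then repeats the loop, which would acquire the lock twice.

```python
def feasible(path, initial_states):
    """Return an initial state that can follow `path`, or None if there is none."""
    for init in initial_states:
        state = dict(init)
        for kind, fn in path:
            if kind == "assign":
                state = fn(state)
            elif not fn(state):  # kind == "assume": the branch condition must hold
                break
        else:
            return init
    return None


inits = [
    {"nPackets": n, "nPacketsOld": o, "req": r}
    for n in range(3) for o in range(3) for r in (0, 1)
]

save_old = ("assign", lambda s: {**s, "nPacketsOld": s["nPackets"]})
loop_repeats = ("assume", lambda s: s["nPackets"] != s["nPacketsOld"])

# Abstract counterexample: if-branch not taken, yet the loop repeats
spurious_path = [save_old, ("assume", lambda s: not s["req"]), loop_repeats]

# For comparison: if-branch taken, nPackets incremented, loop repeats
real_path = [
    save_old,
    ("assume", lambda s: bool(s["req"])),
    ("assign", lambda s: {**s, "nPackets": s["nPackets"] + 1}),
    loop_repeats,
]

print(feasible(spurious_path, inits))  # None: no concrete run follows this path
print(feasible(real_path, inits))      # a concrete witness state
```

The infeasible choice is made at the loop condition, so the predicate taken from that branch, nPackets = nPacketsOld, is the one to add to the predicate set.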

CEGAR

Abstraction:

void example()
begin
  do
A:  KeAcquireSpinLock();
    skip;
    if (*) then
      skip;
B:    KeReleaseSpinLock();
      skip;
      if (*) then
        skip;
      else
        skip;
      fi
      skip;
    fi
  while (*);
C: KeReleaseSpinLock();
end

Refined Abstraction (using the predicate (nPackets = nPacketsOld)):

void example()
begin
  do
A:  KeAcquireSpinLock();
    b := T;
    if (*) then
      skip;
B:    KeReleaseSpinLock();
      skip;
      if (*) then
        skip;
      else
        skip;
      fi
      b := b ? F : *;
    fi
  while (!b);
C: KeReleaseSpinLock();
end

The boolean variable b represents the predicate (nPackets = nPacketsOld)
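To see why the refinement helps, both boolean programs can be explored exhaustively against the lock rule (never acquire a held lock, never release a free one). The sketch below is a hand-written encoding of the two programs, not the output of a tool. The function name and the bound on loop iterations are assumptions, and the inner `if (*)` is omitted because it does not affect the lock or b.

```python
def violates_lock_rule(refined, max_iters=4):
    """Explore all nondeterministic runs of the boolean program for up to
    max_iters loop iterations; return True if some run breaks the lock rule."""
    frontier = {False}  # possible values of `held` on arrival at label A
    for _ in range(max_iters):
        next_frontier = set()
        for held in frontier:
            if held:
                return True  # A: acquire while the lock is already held
            for take_if in (False, True):
                # B releases the lock only inside the if-branch
                held_now = not take_if
                # b := T after A; inside the branch b := b ? F : * gives F
                b = not take_if
                # refined: loop repeats iff !b; unrefined: while (*) may do either
                repeats = [not b] if refined else [False, True]
                for repeat in repeats:
                    if repeat:
                        next_frontier.add(held_now)
                    elif not held_now:
                        return True  # C: release while the lock is not held
        frontier = next_frontier
    return False


print(violates_lock_rule(refined=False))  # True: spurious errors are reachable
print(violates_lock_rule(refined=True))   # False: the refined abstraction is clean
```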

CEGAR

• Using counter-example guided abstraction refinement we are iteratively creating more and more refined abstractions

• This iterative abstraction refinement loop is not guaranteed to converge for infinite domains

– This is not surprising since automated verification for infinite domains is undecidable in general

• The challenge in this approach is automatically choosing the right set of predicates for abstraction refinement

– This is similar to finding a loop invariant that is strong enough to prove the property of interest

SLAM Project

• SLAM project at Microsoft Research

– Verification of C programs

– Can handle unbounded recursion but does not handle concurrency

– Uses predicate abstraction and CEGAR

• SLAM toolkit was developed to find errors in Windows device drivers

– Predicate abstraction example in my slides is from:

• “The SLAM Toolkit”, Thomas Ball and Sriram K. Rajamani, CAV 2001

• Windows device drivers are required to interact with the Windows kernel according to certain interface rules

• SLAM toolkit has an interface specification language called SLIC (Specification Language for Interface Checking) which is used for writing these interface rules (which are state machines)

• The SLAM toolkit checks if the driver code conforms to these interface specifications
