discriminative model checking

Discriminative Model Checking Peter Niebert Doron Peled Amir Pnueli CAV 2008


Page 1: Discriminative Model Checking

Discriminative Model Checking

Peter Niebert, Doron Peled, Amir Pnueli

CAV 2008

Page 2: Discriminative Model Checking

Discriminative Model Checking

Peter Niebert, Doron Peled, Amir Pnueli

CAV 2008

Warning: inside this talk hides another talk! Automatic Generation of Programs Using Model Checking and Genetic Programming

Gal Katz Doron Peled

Page 3: Discriminative Model Checking

Which logic to use?

Linear: each execution is an alternating sequence of states/actions. Use LTL/Buchi automata. Counterexample if property fails.

Branching: a tree represents all executions, including the points where they branch. Allows expressing possibility, e.g., of services.

Page 4: Discriminative Model Checking

Linear Temporal Logic

[Figure: timeline diagrams of the temporal operators; only the labels O and U survive.]

Page 5: Discriminative Model Checking

Computation Tree Logic

[Figure: two computation trees. EG p: p holds at every state along some path. AF p: p eventually holds on every path.]

Page 6: Discriminative Model Checking

Our point of view

Linear time is sufficient for specifying most properties.

A counterexample is often not enough:

It gives very little clue about the location of the error.

It does not give information about how good and bad executions are related to each other.

Thus, for analysis beyond finding the existence of an error, we promote a “deeper” search.

Page 7: Discriminative Model Checking

Our suggestion

Primary or base specification in LTL, for the base property.

Analysis specification: quantifies over executions that satisfy or do not satisfy the base specification.

Syntax: p | \/ | | | | (and others)

Semantics:

- there exists a continuation satisfying the property , where holds from the beginning.

- there exists a continuation not satisfying the property , where holds from the beginning.

Page 8: Discriminative Model Checking

Semantics illustration

Semantics:

- there exists a continuation satisfying the property , where holds from the beginning.

- there exists a continuation not satisfying the property , where holds from the beginning.

[Figure: a computation tree in which two marked continuations are labeled "holds".]

Page 9: Discriminative Model Checking

Examples for specifications

Bad executions depend on infinitely many “bad choices”: ¬<>true

Before executing a, there are good and bad executions. Once a is executed, things are persistently bad: ((¬Execa/\true)W(Execa/\false))

Properties such as “from some point all continuations are good/bad”.

Page 10: Discriminative Model Checking

How to do model checking?

We need to remember some information about the path so far, in order to verify that, together with the rest of the computation, it does (not) satisfy the base property.

Suppose we ran a Buchi automaton for the base property. Being nondeterministic, it may be running on a branch that cannot be completed.

Thus, we run a subset construction (determinization) of the Buchi automaton.

At the point of branching, we continue with a state consistent with one of the Buchi states in the current subset.

Apply CTL* model checking to this structure.
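The subset-tracking step described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the automaton encoding (a dict from `(state, letter)` to successor sets), the toy automaton, and the names `step_subset` and `track_prefix` are hypothetical, and acceptance conditions are ignored.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical encoding: delta maps (state, letter) to the set of successor states.
Delta = Dict[Tuple[str, str], Set[str]]


def step_subset(delta: Delta, current: FrozenSet[str], letter: str) -> FrozenSet[str]:
    """One step of the subset construction: every state reachable on `letter`."""
    nxt: Set[str] = set()
    for q in current:
        nxt |= delta.get((q, letter), set())
    return frozenset(nxt)


def track_prefix(delta: Delta, initial: Iterable[str], prefix: Iterable[str]) -> FrozenSet[str]:
    """Subset of automaton states consistent with the path prefix read so far."""
    subset = frozenset(initial)
    for letter in prefix:
        subset = step_subset(delta, subset, letter)
    return subset


# Toy nondeterministic automaton (an assumption for the example): in q0 it may
# guess, on reading "a", that no "b" will follow, and move to q1.
delta: Delta = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "a"): {"q1"},
}

# After "ab" the guess q1 has been refuted, so only q0 remains in the subset.
print(sorted(track_prefix(delta, {"q0"}, "ab")))
```

At a branching point of the model, any state in the returned subset is a candidate from which a continuation could be completed.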

Page 11: Discriminative Model Checking

Complexity

EXPSPACE-complete even for AG true.

Reduction shown for the related logic mCTL* [KV LICS 2006] (this logic has different semantics, where quantification always starts from the initial state).

But: EXPSPACE-complete in the size of the LTL formula, PSPACE-complete in the size of the branching formula.

Page 12: Discriminative Model Checking

Application

Why do we need such an analysis?

…and now we go to another lecture…

Page 13: Discriminative Model Checking

Automatic Generation of Programs Using Model Checking and Genetic Programming

Gal Katz, Doron Peled

TACAS 2008

Page 14: Discriminative Model Checking

Agenda

Introduction & motivation

Genetic Programming

Model Checking

Combined method

Application to mutual exclusion

Conclusions & future work

Page 15: Discriminative Model Checking

Introduction

Genetic programming

A methodology for automatic programming inspired by Darwinian evolution [Koza 92].

Used for automatic generation of programs in various fields.

Mostly used for optimization related problems.

Fitness is usually calculated by checking program performance against test cases.

Less used for problems with a strict specification.

Page 16: Discriminative Model Checking

Introduction (2)

Model Checking

An automatic formal verification technique used mainly with finite-state software and hardware systems.

Can be used to verify communication and concurrent protocols.

Models are checked against a strict specification. The result is either:

A confirmation that the model satisfies the specification, or

A counterexample showing that it does not.

Page 17: Discriminative Model Checking

Introduction (3)

How to construct a model from the spec.?

Synthesis

Transforms the spec. directly into a model that satisfies it.

Complicated. Currently not practical for automatic program generation.

Brute-force enumeration

All possible programs of a specific domain and size are generated and model-checked.

All existing solutions will eventually be found.

Very time-intensive. Not practical for programs with more than a few lines of code.

Page 18: Discriminative Model Checking

Our Method: Combining GP & Model Checking

[Figure: data flow between the User, the GP Engine, and the Enhanced Model Checker.]

1. Specification

2. Configuration

3. Initial population

4. Verification results

5. New programs

6. Final Model / Results

Page 19: Discriminative Model Checking

Main Steady-state GP Algorithm

1. Create initial program population.

2. Randomly choose μ programs.

3. Create λ new programs by applying genetic operations to the above μ programs.

4. Calculate fitness function for μ + λ programs, and use it to select μ new programs.

5. Replace the old μ programs by the selected ones.

6. Repeat steps 2-5 until either:

a. a perfect solution is found, or

b. maximum allowed number of iterations is reached.
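The loop above can be sketched in a few lines. This is a minimal illustration, not the tool described in the talk: the function name `steady_state_gp` and its parameters are hypothetical, and plain truncation (keep the best μ) stands in for the fitness-proportional selection used by the authors.

```python
import random


def steady_state_gp(population, fitness, vary, mu, lam, max_iters, perfect, rng):
    """Steady-state (mu + lambda) loop; `population` is modified in place."""
    for _ in range(max_iters):
        # Step 2: randomly choose mu programs (by index, so they can be replaced).
        idx = rng.sample(range(len(population)), mu)
        parents = [population[i] for i in idx]
        # Step 3: create lam new programs by varying the chosen parents.
        children = [vary(rng.choice(parents), rng) for _ in range(lam)]
        # Step 4: rank the mu + lam programs and select mu of them.
        # Assumption: truncation selection replaces roulette selection here.
        survivors = sorted(parents + children, key=fitness, reverse=True)[:mu]
        # Step 5: replace the old mu programs by the selected ones.
        for i, prog in zip(idx, survivors):
            population[i] = prog
        # Step 6a: stop when a perfect solution is found.
        best = max(population, key=fitness)
        if fitness(best) >= perfect:
            return best
    # Step 6b: iteration budget exhausted, return the best program so far.
    return max(population, key=fitness)


# Toy stand-in for programs: integers, with the "specification" x == 42.
rng = random.Random(0)
population = [rng.randint(0, 100) for _ in range(30)]
toy_fitness = lambda x: -abs(x - 42)
toy_vary = lambda x, r: x + r.choice([-1, 1])
initial_best = max(toy_fitness(x) for x in population)
result = steady_state_gp(population, toy_fitness, toy_vary,
                         mu=5, lam=20, max_iters=200, perfect=0, rng=rng)
```

Because the best of the μ + λ pool always survives, the best fitness in the population never decreases from one iteration to the next.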

Page 20: Discriminative Model Checking

Program Representation

Programs are represented as trees.

Internal nodes represent expressions or instructions with parameters (assignment, while, if, block).

Terminal nodes represent constants or expressions without any parameter (0, 1, 2, me, other).

Strongly-typed GP is used [Montana 95].

[Figure: tree with root "while", whose children are the condition "!=" (over A[2] and 0) and the body "assign" (over A[me] and 1).]

while (A[2] != 0) A[me] = 1
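The tree on this slide can be written down directly. The sketch below is illustrative only: the `Node` class, the `"A[]"` operator name for array access, and the `render` function are assumptions made for the example, not the authors' data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    op: str                                   # instruction, operator, or terminal
    children: List["Node"] = field(default_factory=list)


def render(n: Node) -> str:
    """Turn a program tree back into source text."""
    c = n.children
    if n.op == "while":
        return f"while ({render(c[0])}) {render(c[1])}"
    if n.op == "assign":
        return f"{render(c[0])} = {render(c[1])}"
    if n.op == "!=":
        return f"{render(c[0])} != {render(c[1])}"
    if n.op == "A[]":                         # hypothetical name for array access
        return f"A[{render(c[0])}]"
    return n.op                               # terminal: 0, 1, 2, me, other


program = Node("while", [
    Node("!=", [Node("A[]", [Node("2")]), Node("0")]),
    Node("assign", [Node("A[]", [Node("me")]), Node("1")]),
])

print(render(program))  # while (A[2] != 0) A[me] = 1
```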

Page 21: Discriminative Model Checking

Initial Population Creation

Population usually contains 100 – 1000 programs.

Programs are created recursively using the “grow” method [Koza 92].

The root is randomly selected from instruction nodes.

Offspring are randomly selected from allowed nodes or terminals as long as rules are preserved.

If max allowed tree depth is reached, a terminal must be chosen.
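A typed "grow" procedure along these lines might look as follows. The grammar is an assumption made for the example: the node set is simplified (`assign` takes an index and a value, so it means `A[index] = value`), and the terminals `true` and `empty` are hypothetical additions so that every type has a terminal to fall back on at the depth limit.

```python
import random

# Hypothetical typed grammar: node name -> (result type, argument types).
GRAMMAR = {
    "while":  ("stmt", ("bool", "stmt")),
    "assign": ("stmt", ("int", "int")),   # simplified: A[index] = value
    "empty":  ("stmt", ()),
    "!=":     ("bool", ("int", "int")),
    "true":   ("bool", ()),               # assumed terminal, not from the slides
    "A[]":    ("int", ("int",)),
    "0": ("int", ()), "1": ("int", ()), "2": ("int", ()),
    "me": ("int", ()), "other": ("int", ()),
}


def candidates(result_type, terminals_only):
    """Node names producing `result_type`, optionally restricted to terminals."""
    return [name for name, (rtype, args) in GRAMMAR.items()
            if rtype == result_type and (not terminals_only or not args)]


def grow(result_type, depth, max_depth, rng):
    """Grow a sub-tree of the requested type; force a terminal at max depth."""
    name = rng.choice(candidates(result_type, terminals_only=depth >= max_depth))
    return (name, [grow(t, depth + 1, max_depth, rng) for t in GRAMMAR[name][1]])


def random_program(max_depth, rng):
    """The root is always an instruction node, as on the slide."""
    root = rng.choice(["while", "assign"])
    return (root, [grow(t, 1, max_depth, rng) for t in GRAMMAR[root][1]])


def depth(tree):
    """Number of levels in the tree (a single node has depth 1)."""
    return 1 + max((depth(child) for child in tree[1]), default=0)


population = [random_program(4, random.Random(seed)) for seed in range(100)]
```

Typing is preserved by construction, since each child is drawn only from nodes whose result type matches the parent's argument type.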

Page 22: Discriminative Model Checking

Genetic Operations

At each iteration of the GP algorithm, the following genetic operations are applied to the selected programs:

Reproduction – programs are copied without any change

Mutation

Crossover

Page 23: Discriminative Model Checking

Mutation Operation

The main operation we use.

Allows performing small modifications to an existing program by the following method:

Randomly choose a program node (internal, or leaf).

According to the node type, apply one of the following operations with respect to the chosen node (strong typing must be kept):

Page 24: Discriminative Model Checking

Replacement Mutation type (a)

Replace the sub-tree rooted at the node with a new randomly generated sub-tree.

Can change a single node or an entire sub-tree.

[Figure: the program tree before and after the mutation; the terminal 1 under the assign node is replaced by the sub-tree A[0].]

Before: while (A[2] != 0) A[me] = 1

After: while (A[2] != 0) A[me] = A[0]

Page 25: Discriminative Model Checking

Insertion Mutation type (b)

Add an immediate parent to the selected node.

Randomly create other offspring to the new parent, if needed.

According to the selected parent type, can cause:

Insertion of code,

Wrapping code with a while loop,

Extending Boolean expressions.

[Figure: three trees. The original program; the same program after a block node is inserted as the parent of the assign node; and the result after a new assign sub-tree (A[2] = other) is generated as another child of the block.]

Before: while (A[2] != 0) A[me] = 1

After: while (A[2] != 0) A[2] = other A[me] = 1

Page 26: Discriminative Model Checking

Reduction Mutation Type (c)

Replace the selected node by one of its offspring.

Delete the remaining offspring of the node.

Has the opposite effect of the previous insertion mutation, and reduces the program size.

Page 27: Discriminative Model Checking

Deletion Mutation Type (d)

Delete the sub-tree rooted at the node.

Update ancestors recursively.

[Figure: the program tree before and after the mutation; the assign sub-tree is deleted and the body of the while node becomes "empty".]

Before: while (A[2] != 0) A[me] = 1

Page 28: Discriminative Model Checking

Crossover Operation

Creates new programs by merging building blocks of two existing programs.

Crossover steps are:

Randomly choose a node from the 1st program.

Randomly choose a node from the 2nd program that has the same type as the 1st node.

Exchange the sub-trees rooted at the two nodes, and use the two newly created programs.
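The three crossover steps can be sketched as below. This is an illustrative sketch, not the authors' code: programs are encoded as nested lists `[op, child, ...]`, and the `TYPE` table, the `"A[]"` operator name, and all function names are assumptions made for the example.

```python
import copy
import random

# Hypothetical type table; anything not listed is treated as an int terminal.
TYPE = {"if": "stmt", "while": "stmt", "assign": "stmt", "block": "stmt",
        "empty": "stmt", "!=": "bool", "==": "bool", "A[]": "var"}


def node_type(op):
    return TYPE.get(op, "int")


def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    for i, child in enumerate(tree[1:], start=1):
        yield from paths(child, prefix + (i,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, sub):
    """Put `sub` at `path`; returns the (possibly new) root."""
    if not path:
        return sub
    get(tree, path[:-1])[path[-1]] = sub
    return tree


def size(tree):
    return 1 + sum(size(child) for child in tree[1:])


def crossover(a, b, rng):
    a, b = copy.deepcopy(a), copy.deepcopy(b)      # keep the parents intact
    pa = rng.choice(list(paths(a)))                # node from the 1st program
    wanted = node_type(get(a, pa)[0])
    compatible = [p for p in paths(b) if node_type(get(b, p)[0]) == wanted]
    if not compatible:                             # no node of the same type
        return a, b
    pb = rng.choice(compatible)                    # same-typed node from the 2nd
    sub_a, sub_b = get(a, pa), get(b, pb)
    return replace(a, pa, sub_b), replace(b, pb, sub_a)


p1 = ["if", ["!=", ["A[]", ["me"]], ["1"]], ["assign", ["A[]", ["0"]], ["other"]]]
p2 = ["block", ["assign", ["A[]", ["2"]], ["me"]],
      ["while", ["==", ["A[]", ["me"]], ["other"]], ["empty"]]]
child1, child2 = crossover(p1, p2, random.Random(3))
```

Restricting the second choice to nodes of the same type is what keeps both offspring well typed.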

Page 29: Discriminative Model Checking

Crossover Example

[Figure: the trees of the two parent programs, with the sub-trees to be exchanged marked.]

Parents:

A[2] = me; while (A[me] == other)

if (A[me] != 1) A[0] = other

Offspring:

A[2] = me; A[0] = other

if (A[me] != 1) while (A[me] == other)

Page 30: Discriminative Model Checking

Crossover (cont.)

Heavily used by traditional GP [Koza].

Tries to mimic biological sexual recombination, but unlike biology (and unlike GA), GP lacks the notion of “genes” [Banzhaf et al. 01].

Often acts only as a macro-mutation.

Various methods were developed in order to turn it into a more fruitful operation (Brood, Intelligent crossover).

Still, not a significant operation for small programs like those of Mutual Exclusion.

Page 31: Discriminative Model Checking

Selection

At each iteration, selection is applied to all μ + λ programs (over-production selection).

Programs are selected using a fitness-proportional (roulette) method [Holland 92].

“Elitism” is used to ensure that the best program is always selected.

Similar to Evolution Strategies [Rechenberg 94] and the Brood Recombination method [Tackett 94]: better protection from harmful operations.
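Roulette selection with elitism can be sketched as follows. The function name `select` and the small epsilon weight are assumptions for the example; note that `random.Random.choices` samples with replacement, which may differ from the selection actually used in the talk.

```python
import random


def select(programs, fitness, mu, rng):
    """Pick mu programs: the best one (elitism) plus mu - 1 roulette draws."""
    scores = [fitness(p) for p in programs]
    best_index = max(range(len(programs)), key=lambda i: scores[i])
    chosen = [programs[best_index]]                 # elitism: best always kept
    # Fitness-proportional weights; the epsilon (an assumption) keeps
    # zero-fitness programs selectable and avoids an all-zero weight vector.
    weights = [max(s, 0.0) + 1e-9 for s in scores]
    chosen += rng.choices(programs, weights=weights, k=mu - 1)
    return chosen


# Toy scores borrowed from the example run later in the talk.
scores = {"prog_a": 0.0, "prog_b": 66.77, "prog_c": 75.77, "prog_d": 97.10}
survivors = select(list(scores), scores.get, mu=3, rng=random.Random(0))
```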

Page 32: Discriminative Model Checking

Model Checking

Page 33: Discriminative Model Checking

ω-automata

Run on infinite words, and consist of:

A finite alphabet Σ,

A finite set of states S,

A set of initial states S0 ⊆ S,

A transition relation Δ ⊆ S × S,

A labeling function L : S → Σ,

An acceptance condition Ω.

In this version, the labels are on the states instead of on the arcs.

Page 34: Discriminative Model Checking

Acceptance conditions

For a run p, inf(p) denotes the set of states appearing infinitely often on p.

Buchi condition:

A set of states F ⊆ S.

A run p over A is accepted if inf(p) ∩ F ≠ Ø.

Streett condition:

A set of k pairs (Ei, Fi), 1 ≤ i ≤ k, with Ei, Fi ⊆ S.

A run p over A is accepted if for all pairs: inf(p) ∩ Ei ≠ Ø → inf(p) ∩ Fi ≠ Ø.
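Both conditions are simple set tests once inf(p) is known. The sketch below assumes inf(p) is given explicitly as a set of state names (for instance, the states on the cycle of a lasso-shaped run); the function names are hypothetical.

```python
def buchi_accepts(inf_states, accepting):
    """Buchi: some accepting state is visited infinitely often."""
    return bool(set(inf_states) & set(accepting))


def streett_accepts(inf_states, pairs):
    """Streett: for every pair (E, F), visiting E infinitely often
    implies visiting F infinitely often."""
    inf = set(inf_states)
    return all(not (inf & set(e)) or bool(inf & set(f)) for e, f in pairs)


# A run that loops forever through s1 and s2.
inf_p = {"s1", "s2"}
print(buchi_accepts(inf_p, {"s2"}))                        # True
print(streett_accepts(inf_p, [({"s1"}, {"s3"})]))          # False
```

The second call fails because E = {s1} is visited infinitely often while F = {s3} is not.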

Page 35: Discriminative Model Checking

ω-automata Closure

Buchi automata can be converted into Streett automata, and vice versa.

Both Buchi and Streett automata are closed under intersection and complement.

Streett automata are less simple to use, but are closed under determinization, while Buchi automata are not.

Page 36: Discriminative Model Checking

Building Program’s State-graph

Each state consists of values of variables, program counters, buffers, etc.

Edges represent atomic transitions caused by program instructions.

Can be built by a DFS algorithm.

Can be decomposed into SCCs [Tarjan 72].
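A sketch of both steps, under the assumption that the program is given as a `successors` function from a state to its next states. `explore` builds the state graph by depth-first search and `tarjan_sccs` is a recursive version of Tarjan's algorithm; the function names and the toy graph are made up for the example.

```python
def explore(initial, successors):
    """Build the reachable state graph by (iterative) depth-first search."""
    graph, stack = {}, [initial]
    while stack:
        state = stack.pop()
        if state in graph:
            continue
        graph[state] = list(successors(state))
        stack.extend(graph[state])
    return graph


def tarjan_sccs(graph):
    """Strongly connected components, recursive Tarjan (small graphs only)."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(frozenset(component))

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


# Toy transition system: 0 -> 1 -> 2 -> 1, 2 -> 3, 3 -> 3.
edges = {0: [1], 1: [2], 2: [1, 3], 3: [3]}
state_graph = explore(0, lambda s: edges[s])
components = tarjan_sccs(state_graph)
```

For realistic state spaces an iterative formulation would be needed to avoid Python's recursion limit.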

Page 37: Discriminative Model Checking

Converting Model to ω-automaton

We use the states, initial state and transitions of the program’s state-space.

Acceptance condition can allow all runs, or impose fairness conditions.

Streett automata can be used in order to define various fairness conditions (weak & strong).

Page 38: Discriminative Model Checking

Safety Properties

Basic properties can be checked by simply analyzing the state graph:

Invariants – can be checked on every visited state.

Deadlocks – states without outgoing edges.

Unreachable code – instructions that are not represented on any transition.

Liveness properties require a more complicated process.
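The first two checks reduce to one pass over the state graph. In this sketch the graph encoding (a dict from state to successor list), the function names, and the toy two-process graph are assumptions made for illustration.

```python
def find_deadlocks(graph):
    """Deadlocks: states without outgoing edges."""
    return [state for state, succ in graph.items() if not succ]


def find_invariant_violations(graph, invariant):
    """Invariants are checked on every visited state."""
    return [state for state in graph if not invariant(state)]


# Toy state graph: each state is (location of process 0, location of process 1),
# where "ncs" is the non-critical section and "cs" the critical section.
graph = {
    ("ncs", "ncs"): [("cs", "ncs"), ("ncs", "cs")],
    ("cs", "ncs"): [("ncs", "ncs"), ("cs", "cs")],
    ("ncs", "cs"): [("ncs", "ncs")],
    ("cs", "cs"): [],
}


def mutual_exclusion(state):
    return state != ("cs", "cs")


print(find_deadlocks(graph))                               # [('cs', 'cs')]
print(find_invariant_violations(graph, mutual_exclusion))  # [('cs', 'cs')]
```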

Page 39: Discriminative Model Checking

Specification

We use Linear Temporal Logic (LTL) [Pnueli 77] to define specification properties.

LTL formulas are interpreted over infinite sequences of states, and consist of:

Propositional variables,

Logical connectives, such as ¬, ∧, ∨, →, and

Temporal operators, such as:

◇(p) – p will eventually occur.

□(p) – p always occurs.

A model M satisfies a formula φ (M ⊨ φ) if every (fair) run of M satisfies φ.

Page 40: Discriminative Model Checking

Converting specification to ω-automaton

Every LTL property can be converted into a Buchi automaton with a size exponential in the LTL formula size [Vardi & Wolper 94].

For deterministic Streett automata, a determinization process is also required [Safra 88].

May result in a doubly exponential blowup from LTL property.

Page 41: Discriminative Model Checking

The Model Checking Process [Vardi & Wolper 86]

Both model and specification are converted to ω-automata over the same alphabet.

The alphabet is 2^AP, where AP denotes a set of atomic propositions that may hold on the system states.

Every word accepted by M (a fair run) should be accepted by the spec, therefore we have to check whether L(M) ⊆ L(φ).

Page 42: Discriminative Model Checking

Model Checking Results

It’s easier to check whether: L(M) ∩ (Σ^ω \ L(φ)) = Ø, or L(M) ∩ L(¬φ) = Ø.

Case 1: Intersection is empty. M satisfies φ.

Case 2: Intersection is not empty. Runs contained in the intersection can be used for generating counterexamples.

[Figure: two Venn diagrams of L(M) and L(φ), one per case.]

Page 43: Discriminative Model Checking

Checking for Non-Emptiness

Easy with Buchi automata:

Decompose the intersection graph into maximal SCCs reachable from the root.

Check if an accepting state from F occurs infinitely often inside a reachable SCC.

More complicated with Streett automata.

The algorithm can be used for a single SCC or an entire automaton.
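For the Buchi case, the check amounts to finding a reachable accepting state that lies on a cycle. The sketch below is not the algorithm shown on the slide: to stay short it uses plain reachability instead of an SCC decomposition (an accepting state lies on a cycle exactly when it belongs to a non-trivial SCC), and the graph encoding and function names are assumptions.

```python
def reachable(graph, sources):
    """All states reachable from `sources` (the sources themselves included)."""
    seen, stack = set(), list(sources)
    while stack:
        state = stack.pop()
        if state not in seen:
            seen.add(state)
            stack.extend(graph.get(state, ()))
    return seen


def buchi_nonempty(graph, initial, accepting):
    """True if some run from `initial` visits an accepting state infinitely often."""
    for f in reachable(graph, [initial]) & set(accepting):
        # f lies on a cycle if it is reachable from one of its own successors.
        if f in reachable(graph, graph.get(f, ())):
            return True
    return False


# Toy intersection graph: 0 -> 1 -> 2 -> 1, and an unreachable self-loop on 3.
graph = {0: [1], 1: [2], 2: [1], 3: [3]}
print(buchi_nonempty(graph, 0, {2}))  # True
print(buchi_nonempty(graph, 0, {0}))  # False
```

This version runs in quadratic time in the worst case; the SCC-based check named on the slide is linear.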

Page 44: Discriminative Model Checking

Model Checking and GP

Can standard model checking results be used as a GP fitness function?

Yes, but so far this was done with limited success [Johnson 07].

A fitness function with just two values is a poor one.

We wish to analyze the model checking graph in order to quantify the level of satisfaction.

When using nondeterministic Buchi automata, a single program computation may have multiple accepting and non-accepting paths, which is difficult to analyze.

Deterministic Streett automata are not more expensive, but ensure symmetry between accepting and non-accepting paths.

Page 45: Discriminative Model Checking

Enhanced Model Checking Algorithm

The idea:

We assume that a hostile scheduler (or environment) chooses the execution path.

For each spec. property, we check the amount of work the scheduler has to do in order to cause a property violation.

The results are used for setting the fitness level & scores.

Page 46: Discriminative Model Checking

Fitness Level 0

All SCCs are empty (not accepting).

Property is never satisfied.

No scheduler choices are needed.

[Figure: SCC graph with nodes A, B, C, D, E; legend: empty SCC, accepting SCC.]

Page 47: Discriminative Model Checking

Fitness Level 1

At least one accepting SCC.

At least one empty bottom SCC (BSCC).

A finite number of scheduler choices can lead the execution into the empty BSCC (D in the example).

The program will stay there forever.

A BSCC with only one node means a deadlock, which gets a worse score.

[Figure: SCC graph with nodes A, B, C, D, E; legend: empty SCC, accepting SCC.]

Page 48: Discriminative Model Checking

Fitness Level 2

All BSCCs are accepting.

At least one empty SCC.

Infinite scheduler choices are needed for keeping the program inside the empty SCC (B in the example).

[Figure: SCC graph with nodes A, B, C, D, E; legend: empty SCC, accepting SCC.]

Page 49: Discriminative Model Checking

Fitness Level 3

All SCCs are accepting.

There still may be SCCs that are not universal, and contain violating paths.

Therefore, the graph universality is checked.

If the graph is not universal, we are still at level 2.

Otherwise, level 3 is assigned.

In this case, even infinite scheduler choices cannot cause a violation, since the property is always satisfied.

[Figure: SCC graph with nodes A, B, C, D, E; legend: empty SCC, accepting SCC.]
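The four levels can be summarised as a small decision procedure. This is a sketch of the classification as described on the slides, not the authors' implementation: the `SCC` record, the function name, and the `universal` flag (assumed to be computed by a separate universality check) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SCC:
    accepting: bool   # False means "empty" in the slides' terminology
    bottom: bool      # True if no edge leaves the component (a BSCC)


def fitness_level(sccs: List[SCC], universal: bool) -> int:
    if not any(c.accepting for c in sccs):
        return 0      # all SCCs are empty: the property is never satisfied
    if any(c.bottom and not c.accepting for c in sccs):
        return 1      # finitely many scheduler choices reach an empty BSCC
    if any(not c.accepting for c in sccs) or not universal:
        return 2      # a violation needs infinitely many scheduler choices
    return 3          # the property is always satisfied


level = fitness_level([SCC(accepting=True, bottom=False),
                       SCC(accepting=False, bottom=True)], universal=False)
print(level)  # 1
```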

Page 50: Discriminative Model Checking

Overall Fitness Function

Fitness levels & scores are calculated for each specification property.

How to merge into a single fitness function?

Naïve summing can bias the results, since some properties may be trivially satisfied when more basic properties are violated.

Thus, spec. properties are divided into levels, starting from level 1 for the most basic properties.

As long as not all properties at level i are satisfied, properties at higher levels get a fitness of 0.

This algorithm also saves running time by skipping unneeded checks.
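The layered merge can be sketched as below. The function name, the property names, and the per-property scores in [0, 1] are assumptions made for the example; the talk's actual scores are scaled differently and this sketch does not normalise them.

```python
def overall_fitness(levels, score, satisfied):
    """Sum property scores level by level, stopping at the first level
    that is not fully satisfied (higher levels contribute 0)."""
    total = 0.0
    for properties in levels:          # ordered from the most basic level
        total += sum(score(p) for p in properties)
        if not all(satisfied(p) for p in properties):
            break                      # higher levels are never evaluated
    return total


# Hypothetical two-level specification.
levels = [["mutual_exclusion"], ["no_starvation_0", "no_starvation_1"]]

scores_a = {"mutual_exclusion": 0.5, "no_starvation_0": 1.0, "no_starvation_1": 1.0}
scores_b = {"mutual_exclusion": 1.0, "no_starvation_0": 0.5, "no_starvation_1": 1.0}

fitness_a = overall_fitness(levels, scores_a.get, lambda p: scores_a[p] == 1.0)
fitness_b = overall_fitness(levels, scores_b.get, lambda p: scores_b[p] == 1.0)
```

In the first case the basic level fails, so the higher-level scores are ignored even though they look perfect; breaking out of the loop is also what saves the unneeded checks.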

Page 51: Discriminative Model Checking

Parsimony

GP programs tend to grow over time to the maximal allowed tree size (“bloating”).

Large portions of the code become “introns” (junk DNA).

To avoid that, we use parsimony as a secondary fitness measure.

The number of program nodes multiplied by a small factor is subtracted from the fitness score.

The factor should be carefully chosen:

It should encourage programs to reduce their size, but

It should not harm the evolutionary process.

Therefore, programs cannot get a score of 100, but only get close to it. The run can be stopped when all properties are satisfied.

Programs can be reduced either by mutations, or directly by detecting dead code by the model checking process, and then removing it.

Page 52: Discriminative Model Checking

“Vacuity”

Special care is needed for implication properties of the form (p → q).

Some (or all) executions may be vacuously satisfied if p never happens.

We are usually interested only in runs where p eventually occurs.

Other runs are neither good nor bad. They are irrelevant.

Thus, in these cases, the program automaton is first intersected with the property p.

Some SCCs might be marked irrelevant.

If all SCCs are irrelevant, fitness level 0 is assigned.

A similar mechanism is used for excluding unfair runs.

[Figure: SCC graph for the implication property, with nodes labeled by p and q.]

Page 53: Discriminative Model Checking

The Mutual Exclusion Problem

Originally described by [Dijkstra 65].

Many variants and solutions exist.

Modeled using the following program parts:

Non Critical Section

Pre Protocol

Critical Section

Post Protocol

We wish to automatically generate correct code for the pre and post protocol parts.

Page 54: Discriminative Model Checking

Spec. Properties

The specification includes the following LTL properties:

The properties are converted into Streett automata.

Page 55: Discriminative Model Checking

Runs Configuration

3 different sets of runs:

The following parameters were used:

Population size: 150

Max number of iterations: 2000

μ: 5

λ: 150

Page 56: Discriminative Model Checking

An Example of a Run (1st variant)

Randomly created.

Does not satisfy the mutual exclusion property.

Higher level properties are set to 0.

Score: 0.0

Page 57: Discriminative Model Checking

An Example of a Run (1st variant)

Randomly created.

While loop guarantees mutual exclusion.

Only process 0 can enter the critical section.

Score: 66.77

Page 58: Discriminative Model Checking

An Example of a Run (1st variant)

Last line changed by a mutation.

The naïve mutual exclusion algorithm.

Processes use a “turn” flag, but depend on each other.

A local maximum point in the search space.

Score: 75.77

Page 59: Discriminative Model Checking

An Example of a Run (1st variant)

An important building block common to many algorithms.

Each process sets its own flag and waits for the other’s flag, but the flag is not turned off correctly.

Might eventually deadlock, thus properties 4 and 5 get a fitness level of 1.

Score: 70.17

Page 60: Discriminative Model Checking

An Example of a Run (1st variant)

Last line is replaced by a mutation.

Now, process 0 correctly turns its flag off.

Property 5 is fully satisfied.

Score: 76.10

Page 61: Discriminative Model Checking

An Example of a Run (1st variant)

A single node is changed by a mutation.

Both processes turn off their flag.

Properties 4 and 5 are fully satisfied.

Still, deadlock occurs if both processes enter simultaneously.

Score: 92.77

Page 62: Discriminative Model Checking

An Example of a Run (1st variant)

A mutation added a line to the empty while loop.

This turns the deadlock into a livelock, and causes a slight fitness improvement.

Score: 93.20

Page 63: Discriminative Model Checking

An Example of a Run (1st variant)

Another line is added to the while loop.

No more deadlocks or livelocks, but the property can still be violated by some infinite scheduler choices.

Score: 94.37

Page 64: Discriminative Model Checking

An Example of a Run (1st variant)

Created by some random mutations.

All properties are satisfied.

Still, not the shortest solution.

Score: 96.50

Page 65: Discriminative Model Checking

An Example of a Run (1st variant)

Created by more mutations.

The shortest algorithm found.

Identical to the known “One bit protocol” [Burns & Lynch 93].

Score: 97.10

Page 66: Discriminative Model Checking

Fitness Graph Best fitness is alternately improved by:

Major leaps due to changes in fitness levels. Small improvements caused by parsimony pressure.

Page 67: Discriminative Model Checking

More experiments

Successfully found Dekker's algorithm [Dijkstra 65].

Successfully found Peterson’s algorithm [Peterson & Fisher 77].

Found a shorter algorithm than Dekker's.

Page 68: Discriminative Model Checking

Performance

First variant was easiest to solve.

Other variants are much harder to find.

Still, much better than brute-force methods.

Less significant on small programs (Peterson).

Crucial on large programs (Dekker).

Page 69: Discriminative Model Checking

Conclusions and Future Work

GP and model checking were successfully combined.

To achieve that, a specific tool was developed.

Found solutions are guaranteed to completely satisfy the specification.

The scoring system can be further refined.

More information can be extracted from the model checking results, for assisting the evolutionary process.

A similar method can be used for correcting a given program, or at least showing where the error is.

Next step: use discriminative model checking properties to refine grading and to find where in the program to make changes.