The Yogi Project: Software property checking via verification and testing

Aditya V. Nori, Sriram K. Rajamani

Programming Languages and Tools, Microsoft Research India

The plan

Part 1: Our philosophy, the basic algorithm, related work

coffee break

Part 2: Handling programs with pointers and procedures

coffee break

Part 3: The engineering

Download: http://research.microsoft.com/yogi

Thanks to our collaborators

Aws Albarghouthi, Nels Beckman, Patrice Godefroid, Bhargav Gulavani, Tom Henzinger, Yamini Kannan, Rahul Kumar, Robert Simmons, Sai Deep Tetali, Aditya Thakur

Property checking

void f(int *p, int *q)
{
0:  *p = 4;
1:  *q = 5;
2:  if ()
3:    error();
}

Question: Is the error unreachable for all possible inputs?

More generally, we are interested in the query

Possible solution: Testing

The “old-fashioned” and practical method of validating software

Generate test inputs and see if we can find a test that violates the assertion

What’s wrong with testing?

If we view testing as a “black-box” activity, Dijkstra is right! After executing many tests, we still don’t know if there is another test that can violate the assertion

If we view testing as a “white-box” activity, and “observe” what happens inside the program, we can do interesting things:
A. For a buggy program, we can generate tests in a directed manner to find the bug
B. For a correct program, we can prove that the assertion holds for all inputs!

Our hypothesis

Tests: presence of bugs

void foo(int a)
{
0:  i = 0;
1:  c = 0;
2:  while (i < 1000) {
3:    c = c + i;
4:    i = i + 1;
    }
5:  if (a <= 0)
6:    error();
}

𝑎↦−5

Proofs: absence of bugs

void foo(int y1, int y2)
{
0:  state = 1;
1:  if (y1) {
2:    x0 = x0 + 1;
    } else {
3:    x0 = x0 - 1;
    }
4:  if (y2) {
5:    x1 = x1 + 1;
    } else {
6:    x1 = x1 - 1;
    }
7:  if (state != 1)
8:    error();
}

Exponential number of tests required!

overapproximation

[Figure: region graph of foo. Edges: state = 1; assume(y1) / assume(!y1); x0=x0+1 / x0=x0-1; assume(y2) / assume(!y2); x1=x1+1 / x1=x1-1; assume(state != 1) into 8:error; assume(state == 1) into 9]

linear proof exists!

[Figure: the same graph with every region from 0 to 7 annotated with state=1, which cuts off the edge into 8:error]

Yogi: Proofs from Tests

Algorithm uses only test case generation operations

Maintains two data structures:
▪ A forest of reachable concrete states (tests)
  ▪ Under-approximates executions of the program
▪ A region graph (an abstraction)
  ▪ Over-approximates all executions of the program

[Figure: region graph of foo, from state = 1 through the assume edges to 8:error and 9]

Our goal: bug finding and proving
▪ If a test reaches an error, we have found a bug
▪ If we refine the abstraction so that there is *no* path from the initial region to the error region, we have a proof

Key ideas
▪ Frontier: boundary between tested regions and untested regions
▪ Refinement uses only aliases α that are present along the concrete tests that are executed

Key ideas

Frontier = edge that crosses from tested to untested region

Step 1: Try to generate a test that crosses the frontier
▪ Perform symbolic simulation on the path until the frontier and generate a constraint
▪ Conjoin with the condition needed to cross the frontier
▪ Is the conjunction satisfiable?

Step 2, if satisfiable [YES]: run the test and extend the frontier

Step 2, if unsatisfiable [NO]: refine the abstraction so that the frontier moves back!

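The Yogi algorithm

Input: Program, Property

1. Construct initial abstraction
2. Construct random tests
3. Test succeeded? If yes, a test reaches the error and we have found a bug
4. Abstraction succeeded? If yes: Proof!
5. τ = error path in abstraction, f = frontier of error path
6. Can extend test beyond frontier? If yes, add the new test; if no, Refine abstraction. In both cases repeat from step 3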
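The loop above can be sketched in C on an explicit finite transition system. This is only an illustration under strong assumptions, not the Yogi implementation: states are the integers 0..n-1, "running a test" means following one edge from a state that an earlier test already reached, and splitting a region by its predecessor states stands in for symbolic refinement. The names `yogi_explicit`, `verdict`, and `MAXN` are hypothetical.

```c
#include <stdbool.h>

#define MAXN 16  /* hypothetical bound on the number of concrete states */

typedef enum { PROOF, BUG } verdict;

/* edge[s][t] is the concrete transition relation; init and error are states. */
verdict yogi_explicit(int n, bool edge[MAXN][MAXN], int init, int error)
{
    int region[MAXN];             /* region id of each concrete state */
    bool tested[MAXN] = {false};  /* states reached by some test */
    int nregions = 2;

    /* Initial abstraction: the error state alone, everything else together. */
    for (int s = 0; s < n; s++) region[s] = (s == error) ? 1 : 0;
    tested[init] = true;

    for (;;) {
        if (tested[error]) return BUG;  /* a test reached the error */

        /* Region graph: r1 -> r2 if some state of r1 steps into r2. */
        bool redge[MAXN][MAXN] = {{false}};
        bool has_test[MAXN] = {false};
        for (int s = 0; s < n; s++) {
            if (tested[s]) has_test[region[s]] = true;
            for (int t = 0; t < n; t++)
                if (edge[s][t]) redge[region[s]][region[t]] = true;
        }

        /* Breadth-first search for an abstract error path. */
        int parent[MAXN], queue[MAXN], head = 0, tail = 0;
        int r0 = region[init], rerr = region[error];
        for (int r = 0; r < nregions; r++) parent[r] = -1;
        parent[r0] = r0;
        queue[tail++] = r0;
        while (head < tail) {
            int r = queue[head++];
            for (int q = 0; q < nregions; q++)
                if (redge[r][q] && parent[q] < 0) { parent[q] = r; queue[tail++] = q; }
        }
        if (parent[rerr] < 0) return PROOF;  /* no abstract path to the error */

        /* Frontier: first edge (src, dst) on the path whose target is untested. */
        int path[MAXN], len = 0;
        for (int r = rerr; ; r = parent[r]) { path[len++] = r; if (r == r0) break; }
        int k = len - 1;                     /* path[len - 1] is the initial region */
        while (has_test[path[k - 1]]) k--;
        int src = path[k], dst = path[k - 1];

        /* Try to extend a test across the frontier. */
        bool extended = false;
        for (int s = 0; s < n && !extended; s++)
            for (int t = 0; t < n && !extended; t++)
                if (tested[s] && region[s] == src && region[t] == dst && edge[s][t]) {
                    tested[t] = true;
                    extended = true;
                }
        if (extended) continue;

        /* Otherwise refine: states of src that can step into dst get a new region. */
        for (int s = 0; s < n; s++) {
            if (region[s] != src) continue;
            for (int t = 0; t < n; t++)
                if (edge[s][t] && region[t] == dst) { region[s] = nregions; break; }
        }
        nregions++;
    }
}
```

On the chain 0 → 1 → 2 with a separate edge 3 → 4, initial state 0 and error state 4, the sketch splits off state 3 in one refinement and then finds no abstract path to the error. With error state 2 on the chain 0 → 1 → 2 it extends tests across the frontier twice and reports a bug.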

The Yogi algorithm

void f(int y)
{
0:  int lock=0, x;
1:  do {
2:    lock = 1;
3:    x = y;
4:    if (*) {
5:      lock = 0;
6:      y = y+1;
      }
7:  } while (x != y);
8:  if (lock != 1)
9:    error();
10: }

[Figure: region graph over the program locations, with the test input 𝑦=1 and the frontier marked]

Symbolic execution + Theorem proving

Symbolic execution

𝜏=(0,1,2,3,4,7,8,9)

[Table: symbol table for y, lock, x and the constraints collected along 𝜏]

Refinement

[Figure: region 8 is refined into 8:¬ρ and 8:ρ along the frontier edge assume(lock != 1) into region 9; ρ is computed with WP]

Candidates for refinement predicate

A. Strongest postcondition (SP) along the path up to the frontier
B. Weakest precondition (WP)
C. Any formula separating the two above (e.g., an interpolant)
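For the frontier edge assume(lock != 1) of the example, candidates A and B can be written out as a small worked calculation. This is a sketch that assumes WP is read as the pre-image of the target region under the statement, and that the target region at location 9 is unconstrained (true); both readings are assumptions, not something the slides state.

```latex
% Assumption: WP(op, phi) denotes the set of states from which op leads into phi.
\begin{align*}
\mathrm{WP}(\mathtt{assume}(c),\ \varphi) &= c \wedge \varphi \\
\rho = \mathrm{WP}(\mathtt{assume}(lock \neq 1),\ \mathit{true}) &= (lock \neq 1) \\
\mathrm{SP}\ \text{along}\ \tau\ \text{up to location 8} &\Rightarrow (lock = 1)
\end{align*}
```

Since lock = 1 contradicts ρ, no test along this path can cross the frontier, and ρ is a usable refinement predicate for region 8.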

Refinement

𝜌 = (lock != 1)

[Figure: region 8 split into 8:¬ρ and 8:ρ; only 8:ρ keeps the edge assume(lock != 1) into region 9]

Next iteration

𝜏=(0,1,2,3,4,7,<8,p>,9)

[Figure: region 8 split into 8:¬p and 8:p; the new abstract error path 𝜏 passes through <8,p>, and the frontier is marked on it]

Correct, the program is …

[Figure: final region graph. Regions shown: 3, 4⋀¬s, 5⋀¬s, 6⋀¬r, 7⋀¬q, 8⋀¬p and 4⋀s, 5⋀s, 6⋀r, 7⋀q]

Soundness and Termination

• Theorem: Suppose we run Yogi on any program and property
  – If Yogi returns a proof, then the abstract program simulates the program, and thus is a proof that the program does not reach the error
  – If Yogi returns a test, then the test is an error trace

SLAM and CEGAR

[Figure: the SLAM loop. Labels: prog. P, SLIC rule, prog. P’, boolean program, path, predicates, newton]

CEGAR (SLAM) on an example

void foo(int a)
{
0:  i = 0;
1:  c = 0;
2:  while (i < 1000) {
3:    c = c + i;
4:    i = i + 1;
    }
5:  if (a <= 0)
6:    error();
}

CEGAR will need to refine the loop 1000 times!

Yogi on the same example

𝜏=(0,1,2,(3,4,2)^1000,5,6)

Symbolic execution produces the constraint

[Figure: foo from the CEGAR example above, with the frontier marked]
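What a single concrete test does on this loop can be sketched in C. The function below is an instrumented copy of foo, with the hypothetical name `loop_foo_reaches_error`; it returns whether error() at label 6 would be reached and reports how many loop iterations the test executed. The declarations of i and c are added here so that the code compiles.

```c
#include <stdbool.h>

/* Instrumented copy of foo: returns true if error() at label 6 would be
   reached; *iterations receives the number of loop iterations executed. */
bool loop_foo_reaches_error(int a, int *iterations)
{
    int i = 0, c = 0;
    *iterations = 0;
    while (i < 1000) {
        c = c + i;
        i = i + 1;
        (*iterations)++;
    }
    (void)c;         /* c never influences the branch at label 5 */
    return a <= 0;   /* the branch that guards error() */
}
```

Running the test a = -5 executes all 1000 iterations concretely and reaches the error, so the loop is crossed by execution rather than by 1000 refinement steps.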

Directed testing with DART

void test_me(int x, int y)
{
1:  int z = 2*x;
2:  if (z==y) {
3:    if (y == x+10)
4:      error();
    }
5:  return;
}

Step 1:
• Try random test:

DART: Directed Automated Random Testing

Patrice Godefroid, Nils Klarlund, Koushik Sen, PLDI 2005

Directed testing with DART

Step 1:
• Try random test:

Step 2:
• Do a “parallel” symbolic execution
• Collect constraints and negate the last branch
• Solve for the constraint:
• Try next test

Directed testing with DART

Step 1:
• Try test:

Step 2:
• Do a “parallel” symbolic execution
• Collect constraints and negate the last branch
• Solve for constraint:
• Try next test
• Find error!
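The two steps can be sketched in C for test_me. This is a simplified illustration, not DART itself: a brute-force search over a small input range stands in for the constraint solver, and the helper names (`run_test_me`, `solve`, `dart_test_me`) and the search bounds are hypothetical.

```c
#include <stdbool.h>

/* Instrumented test_me: records the outcome of each branch that was
   evaluated and returns true if error() would be reached. */
static bool run_test_me(int x, int y, bool *b1, bool *b2)
{
    int z = 2 * x;
    *b1 = (z == y);                 /* branch at label 2 */
    *b2 = *b1 && (y == x + 10);     /* branch at label 3, only seen if b1 holds */
    return *b2;
}

/* Stand-in for a constraint solver: search a small range for inputs with
   2*x == y and, if need_b2 is set, also y == x + 10. */
static bool solve(bool need_b2, int *x, int *y)
{
    for (int cx = -50; cx <= 50; cx++)
        for (int cy = -100; cy <= 100; cy++) {
            bool c1 = (2 * cx == cy);
            bool c2 = (cy == cx + 10);
            if (c1 && (!need_b2 || c2)) { *x = cx; *y = cy; return true; }
        }
    return false;
}

/* Directed search from the given first test. Returns the number of tests
   run until the error is found (inputs stored in err_x, err_y), or -1. */
int dart_test_me(int x, int y, int *err_x, int *err_y)
{
    for (int tests = 1; tests <= 3; tests++) {
        bool b1, b2;
        if (run_test_me(x, y, &b1, &b2)) { *err_x = x; *err_y = y; return tests; }
        /* Negate the last branch on the executed path. If the first branch
           was not taken, ask for 2*x == y; otherwise also ask for y == x + 10. */
        if (!solve(b1, &x, &y)) return -1;
    }
    return -1;
}
```

Starting from a first test that misses the branch at label 2, for instance x = 1, y = 5 (an arbitrary choice), the search needs one solved constraint per nested branch before it reaches error().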

Path explosion with DART

void foo(int x1, int x2, …, int xn)
{
  lock = 1;
  if (x1) g = g + 1; else g = g - 1;
  if (x2) g = g + 1; else g = g - 1;
  ...
  if (xn) g = g + 1; else g = g - 1;
  if (lock != 1) error();
}

• Full path coverage expensive

• Yogi uses abstraction to ignore irrelevant states and efficiently computes a proof
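The cost of full path coverage for this program can be made concrete with a short C sketch. It encodes the inputs x1..xn as the bits of one integer and runs one test per path; the names `path_foo_reaches_error` and `cover_all_paths`, the initial value g = 0, and the restriction to small n are assumptions made for the illustration.

```c
#include <stdbool.h>

/* Runs foo on the input encoded by the bits of `inputs` (bit i is x_{i+1}).
   Returns true if error() would be reached. */
static bool path_foo_reaches_error(int n, unsigned long inputs)
{
    int lock = 1, g = 0;
    for (int i = 0; i < n; i++) {
        if ((inputs >> i) & 1UL) g = g + 1; else g = g - 1;
    }
    (void)g;             /* g never influences the final check */
    return lock != 1;
}

/* One test per path. Returns the number of tests run, or -1 if some test
   reaches the error. Intended for small n (below the width of long). */
long cover_all_paths(int n)
{
    long tests = 0;
    for (unsigned long inputs = 0; inputs < (1UL << n); inputs++) {
        if (path_foo_reaches_error(n, inputs)) return -1;
        tests++;
    }
    return tests;
}
```

The test count doubles with every added branch, although none of the branches touches lock, which is the only variable the final check reads.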

Partition refinement, Bisimulations and the Lee-Yannakakis algorithm

Bisimulations

• A partition is stable if, for every region σ, its set of predecessor states pre(σ) doesn’t split any other region in a non-trivial manner

• Bisimulation = coarsest stable partition, precise abstraction for property checking

[Figure: regions from Init to Error, with one kind of edge drawn between regions and another between concrete states]

Partition refinement

Partition refinement:
• For every region σ, compute pre(σ) and split every region of the partition as needed
• Until no splits can be generated
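The splitting loop above can be sketched in C over an explicit finite transition system. This is the naive refinement scheme, written only for illustration: the function name `refine_partition`, the bound `MAXN`, and the adjacency-matrix encoding are assumptions.

```c
#include <stdbool.h>

#define MAXN 16  /* hypothetical bound on the number of states */

/* Splits regions until the partition is stable and returns the number of
   regions. block[s] is the region id of state s, on entry and on exit. */
int refine_partition(int n, bool edge[MAXN][MAXN], int block[MAXN], int nblocks)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 0; b < nblocks && !changed; b++) {
            /* in_pre[s]: s has a successor in region b, i.e. s is in pre(b). */
            bool in_pre[MAXN] = {false};
            for (int s = 0; s < n; s++)
                for (int t = 0; t < n; t++)
                    if (edge[s][t] && block[t] == b) in_pre[s] = true;

            /* Split every region that pre(b) cuts in a non-trivial manner. */
            for (int c = 0; c < nblocks; c++) {
                int inside = 0, outside = 0;
                for (int s = 0; s < n; s++)
                    if (block[s] == c) { if (in_pre[s]) inside++; else outside++; }
                if (inside > 0 && outside > 0) {
                    for (int s = 0; s < n; s++)
                        if (block[s] == c && in_pre[s]) block[s] = nblocks;
                    nblocks++;
                    changed = true;  /* restart with the refined partition */
                }
            }
        }
    }
    return nblocks;
}
```

For the system with edges 0 → 1, 1 → 2, 3 → 2 and the initial partition {0, 1, 3}, {2}, the loop separates state 0 from states 1 and 3, because only 1 and 3 can step into the region of state 2, and then stops with three regions.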

[Figure: regions 𝜎1, 𝜎2, 𝜎3, with 𝜎1 split by pre(𝜎3)]

Partition refinement

• On termination generates a bisimulation:
  • For every pair of regions 𝜎1 and 𝜎2 we have that 𝜎1 → 𝜎2 iff there is some state s1 ∈ 𝜎1 and some state s2 ∈ 𝜎2 such that s1 → s2

On termination Yogi generates a simulation:
For every pair of regions 𝜎1 and 𝜎2, we have: if there is some state s1 ∈ 𝜎1 and some state s2 ∈ 𝜎2 such that s1 → s2, then 𝜎1 → 𝜎2

Lee-Yannakakis 92 and Pasareanu-Pelanek-Visser 05

• Compute reachable bisimulation quotient

• Do splitting only for reachable portion of the state space

• Generate witnesses for reachability of regions and split only regions which have reachable witnesses

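• On termination reachable partitions are a bisimulation:
  • For every pair of regions 𝜎1 and 𝜎2 we have that 𝜎1 → 𝜎2 iff there is some state s1 ∈ 𝜎1 and some state s2 ∈ 𝜎2 such that s1 → s2

[Figure: regions between Init and Error; regions with reachable witnesses are marked “Can split”, the others “Can’t split”]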

Comparison with Yogi

void foo(int y)
{
0:  while (y>0) {
1:    y = y - 1;
    }
2:  if (false)
3:    error();
}

• Yogi terminates with this abstraction shown

• In contrast, Lee-Yannakakis will keep refining regions with new predicates forever

• Reachable bisimulation quotient for this program is infinite

• But bisimulation quotient is not necessary to prove the property. Simulation is sufficient, and Yogi is able to find such a simulation

Summary so far …

Yogi combines testing and verification
▪ Maintains a partition (region graph) just like SLAM
▪ Generates witnesses using test case generation like DART

Core operation:
▪ If frontier can be extended by a test case, extend
▪ If frontier cannot be extended, refine the pre-state region

Combines advantages of SLAM and DART
▪ Uses regions to avoid exploring exponential number of paths (like SLAM)
▪ Uses tests to explore complicated code (like DART)

Computes simulations, which are sufficient for proofs

Terminates in more cases than algorithms that compute bisimulation quotients, such as Lee-Yannakakis
