DDEC: Data-Driven Equivalence Checking Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken


Post on 23-Feb-2016


Page 1: DDEC: Data-Driven Equivalence Checking

DDEC: Data-Driven Equivalence Checking Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken

Page 2: DDEC: Data-Driven Equivalence Checking

Equivalence checking

Prove two programs are equivalent
  Compiler optimizations
  Validate refactorings
  Cross-checking different implementations

Old and well-studied problem
  Undecidable in general
  Major challenge: prove equivalence of loops
  Straight-line programs are relatively easy

Page 3: DDEC: Data-Driven Equivalence Checking

Motivating applications

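Prove equivalence of two binaries

[Figure: a source program containing a while loop is compiled twice, once by a trustworthy compiler (CompCert, gcc -O0) and once by an optimizing compiler (gcc -O3, icc -O3).]

Confidence of the trustworthy compiler, performance of the optimizing compiler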

Page 4: DDEC: Data-Driven Equivalence Checking

Stochastic superoptimization

[Figure: pipeline diagram. Labels: source program containing a while loop; trustworthy compiler (CompCert, gcc -O0); STOKE (ASPLOS 13); random mutations; straight-line code.]

Page 5: DDEC: Data-Driven Equivalence Checking

Previous work

Do not support “while” loops: [CHR00], [FH02], [FH05], [AEF+05], [SBC+05], [MSF06]

Do not reason about termination: [SDE+08], [GS09], [RE11], [LHM+12], [PY13], [LMS+13]

Translation validation: [Nec00], [GZB05], …
  Needs information from the compiler

Page 6: DDEC: Data-Driven Equivalence Checking

Simulation relation

Decompose proof

Target:
  movq 8(rsp), rdi
  # rdi != 0
  movq 8(rsp), rdi
  decq rdi
  movq rdi, 8(rsp)
  retq

Rewrite:
  movq 8(rsp), r9
  # r9 != 0
  decq r9
  retq

Relations at the synchronization points:
  a, a': states equal
  b, b': 8(rsp) = rdi = r9'
  c, c': live outs equal
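To make the relation on this slide concrete, here is a minimal Python sketch that models the Target and Rewrite loops and records the state each time the loop test is reached. The modelling is an assumption for illustration only: `mem` stands for the stack slot 8(rsp), registers are plain non-negative integers, the live outs are taken to be rdi and r9, and the function names are hypothetical rather than part of DDEC. Running it on a few inputs only tests the relation; it does not prove it.

```python
def run_target(n):
    """Target loop: the counter lives in 8(rsp) and is reloaded every iteration."""
    mem = n                      # initial value of 8(rsp)
    rdi = mem                    # movq 8(rsp), rdi
    trace = [(mem, rdi)]         # state observed at the loop test
    while rdi != 0:              # rdi != 0
        rdi = mem                # movq 8(rsp), rdi
        rdi -= 1                 # decq rdi
        mem = rdi                # movq rdi, 8(rsp)
        trace.append((mem, rdi))
    return rdi, trace            # retq


def run_rewrite(n):
    """Rewrite loop: the counter stays in register r9."""
    r9 = n                       # movq 8(rsp), r9
    trace = [r9]                 # state observed at the loop test
    while r9 != 0:               # r9 != 0
        r9 -= 1                  # decq r9
        trace.append(r9)
    return r9, trace             # retq


def relation_holds(n):
    """Check 8(rsp) = rdi = r9' at every loop test and equal live outs at exit."""
    out_t, trace_t = run_target(n)
    out_r, trace_r = run_rewrite(n)
    synchronized = len(trace_t) == len(trace_r)
    invariant = all(mem == rdi == r9
                    for (mem, rdi), r9 in zip(trace_t, trace_r))
    return synchronized and invariant and out_t == out_r


if __name__ == "__main__":
    print(all(relation_holds(n) for n in range(10)))
```

The sketch only accepts non-negative inputs, since Python integers do not wrap around the way 64-bit registers do.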

Page 7: DDEC: Data-Driven Equivalence Checking

Inference

Given a simulation relation, proofs for loops reduce to proofs for loop-free fragments
  Use decision procedures

Main challenge: infer a simulation relation
  Infer synchronization points
  Infer invariants

We use compilers as black boxes

Mine relations from concrete executions

Page 8: DDEC: Data-Driven Equivalence Checking

Runtime information

Run some tests to get data
  From executions, unit tests, random tests, etc.

Page 9: DDEC: Data-Driven Equivalence Checking

Runtime information

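Ensure the loops execute the same number of iterations
  Use data to align the Target and Rewrite loops

[Figure: Target loop with body B and Rewrite loop with body B', each followed by retq. Iteration counts: B runs 2n times and B' runs n times; grouping the Target iterations as B;B gives n.]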
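One way to read the 2n versus n annotation is as a ratio of observed iteration counts. The following Python sketch, with a hypothetical function name and made-up counts, derives how many Target iterations to group per Rewrite iteration from test data. It is an illustration of the idea under those assumptions, not DDEC's actual alignment procedure.

```python
from fractions import Fraction


def unroll_factors(target_counts, rewrite_counts):
    """Return (p, q) meaning p Target iterations align with q Rewrite iterations.

    Each argument lists the loop iteration counts observed per test.
    Returns None when the tests do not agree on a single ratio.
    """
    ratio = None
    for t, r in zip(target_counts, rewrite_counts):
        if t == 0 and r == 0:
            continue                 # neither loop ran: no information
        if t == 0 or r == 0:
            return None              # one loop ran and the other did not
        current = Fraction(t, r)
        if ratio is None:
            ratio = current
        elif ratio != current:
            return None              # inconsistent ratios across tests
    if ratio is None:
        return (1, 1)
    return (ratio.numerator, ratio.denominator)


if __name__ == "__main__":
    # Target iterates 2n times, Rewrite iterates n times (n = 1, 2, 3).
    print(unroll_factors([2, 4, 6], [1, 2, 3]))
```

With the counts in the example the result is (2, 1), which corresponds to comparing B;B against B'.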

Page 10: DDEC: Data-Driven Equivalence Checking

Runtime information

Attempt to detect synchronization points
  Number of times program points are executed
  Values align

Target:
  movq 8(rsp), rdi
  # rdi != 0
  movq 8(rsp), rdi
  decq rdi
  movq rdi, 8(rsp)
  retq

Rewrite:
  movq 8(rsp), r9
  # r9 != 0
  decq r9
  retq

[Figure annotations: execution counts of the program points, shown as 1, n, and n+1.]
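The execution-count heuristic can be sketched in a few lines of Python. Here a trace is assumed to be a list of program-point labels in execution order; the labels (`entry`, `head`, `body`, `exit`) and both function names are hypothetical. Points whose counts agree on every test become candidate synchronization points. Counts alone can leave ambiguous candidates (entry and exit both execute once), which is where comparing values would come in.

```python
from collections import Counter


def loop_trace(n, suffix=""):
    """Program points visited by a loop that iterates n times."""
    trace = ["entry" + suffix]
    for _ in range(n):
        trace += ["head" + suffix, "body" + suffix]
    trace += ["head" + suffix, "exit" + suffix]   # final, failing loop test
    return trace


def count_vector(traces, point):
    """How often `point` is executed in each trace."""
    return tuple(Counter(trace)[point] for trace in traces)


def candidate_sync_points(target_traces, rewrite_traces):
    """Pair program points whose execution counts match on every test."""
    target_points = sorted({p for t in target_traces for p in t})
    rewrite_points = sorted({p for t in rewrite_traces for p in t})
    return [(tp, rp)
            for tp in target_points
            for rp in rewrite_points
            if count_vector(target_traces, tp) == count_vector(rewrite_traces, rp)]


if __name__ == "__main__":
    tests = (2, 3)
    target = [loop_trace(n) for n in tests]
    rewrite = [loop_trace(n, "'") for n in tests]
    for pair in candidate_sync_points(target, rewrite):
        print(pair)
```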

Page 11: DDEC: Data-Driven Equivalence Checking

Invariants

Invariants are restricted to equalities
Infer invariants from observed data values

  8(rsp)  rdi
  2       2
  1       1
  0       0

Target:
  movq 8(rsp), rdi
  # rdi != 0
  movq 8(rsp), rdi
  decq rdi
  movq rdi, 8(rsp)
  retq

Page 12: DDEC: Data-Driven Equivalence Checking

Invariants

Invariants are restricted to equalities
Infer invariants from observed data values

  8(rsp)  rdi  r9'
  2       2    2
  1       1    1
  0       0    0

Rewrite:
  movq 8(rsp), r9
  # r9 != 0
  decq r9
  retq

Page 13: DDEC: Data-Driven Equivalence Checking

Linear algebra

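Mine all equalities
  Find all $w$ s.t. $Aw = 0$
  Nullspace or kernel

$I \equiv 8(\mathit{rsp}) = \mathit{rdi} \land \mathit{rdi} = \mathit{r9}'$

$I' \equiv 4\,\mathit{eax} = \mathit{edx}' + 3 \land 10\,\mathit{eax} + \mathit{edx} = \mathit{ecx}'$

$A \equiv$ matrix of observed values:

  8(rsp)  rdi  r9'
  2       2    2
  1       1    1
  0       0    0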
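As a rough illustration of mining equalities with a nullspace computation, the Python sketch below runs exact Gaussian elimination over the observed values from this slide. Two things are assumptions made for the example: a column of ones is appended so that equalities with constants (such as the +3 in I') can be expressed, and the standard-library `fractions` module stands in for an integer matrix library. The function names are hypothetical.

```python
from fractions import Fraction


def nullspace(rows):
    """Return a basis of {w : A w = 0} via exact rational Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(a[0])
    pivots = []                          # pivot column of each reduced row
    r = 0
    for c in range(n_cols):
        if r == len(a):
            break
        pivot = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if pivot is None:
            continue                     # no pivot here: c is a free column
        a[r], a[pivot] = a[pivot], a[r]
        scale = a[r][c]
        a[r] = [x / scale for x in a[r]]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        w = [Fraction(0)] * n_cols
        w[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            w[p] = -a[row_index][free]
        basis.append(w)
    return basis


def describe(w, names):
    """Render a nullspace vector as a linear equality over the named columns."""
    terms = [f"{coef}*{name}" for coef, name in zip(w, names) if coef != 0]
    return " + ".join(terms) + " = 0"


if __name__ == "__main__":
    names = ["8(rsp)", "rdi", "r9'", "1"]
    observations = [[2, 2, 2, 1], [1, 1, 1, 1], [0, 0, 0, 1]]
    for w in nullspace(observations):
        print(describe(w, names))
```

For the three observations on this slide the basis has two vectors, which correspond to 8(rsp) = rdi and 8(rsp) = r9'. Together these are equivalent to the invariant I.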

Page 14: DDEC: Data-Driven Equivalence Checking

Check simulation relation
  The executions are synchronized
  The invariants are maintained

Target:
  movq 8(rsp), rdi
  # rdi != 0
  movq 8(rsp), rdi
  decq rdi
  movq rdi, 8(rsp)
  retq

Rewrite:
  movq 8(rsp), r9
  # r9 != 0
  decq r9
  retq

Relations at the synchronization points:
  a, a': states equal
  b, b': $8(\mathit{rsp}) = \mathit{rdi} \land \mathit{rdi} = \mathit{r9}'$
  c, c': live outs equal

Page 15: DDEC: Data-Driven Equivalence Checking

Check simulation relation
  The executions are synchronized
  The invariants are maintained

Queries in quantifier-free bitvector arithmetic
  Complete SMT solvers!
  Incorporate counter-examples in relations

Sound but not complete
  If checking succeeds then equivalent
  Can fail to infer a sound simulation relation
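The shape of these checks can be sketched without an SMT solver by enumerating every state of a tiny machine. The Python example below swaps the bitvector query for brute-force enumeration over 4-bit words, which is an assumption made purely to keep the example self-contained; the word size, the choice of rdi and r9 as live outs, and all function names are hypothetical. It checks the loop-free obligations for the running example and returns a counter-example when a candidate invariant is too weak.

```python
from itertools import product

WIDTH = 4                          # tiny word size so enumeration stays cheap
MASK = (1 << WIDTH) - 1


def target_body(mem, rdi):
    rdi = mem                      # movq 8(rsp), rdi
    rdi = (rdi - 1) & MASK         # decq rdi
    mem = rdi                      # movq rdi, 8(rsp)
    return mem, rdi


def rewrite_body(r9):
    return (r9 - 1) & MASK         # decq r9


def find_counterexample(invariant):
    """Return None if every obligation holds, else (obligation, state)."""
    values = range(MASK + 1)
    # Entry: both prologues load the same initial stack slot.
    for v in values:
        if not invariant(v, v, v):
            return ("entry", (v, v, v))
    for mem, rdi, r9 in product(values, repeat=3):
        if not invariant(mem, rdi, r9):
            continue               # only states satisfying the invariant matter
        if (rdi != 0) != (r9 != 0):
            return ("branch", (mem, rdi, r9))      # executions not synchronized
        if rdi != 0:
            new_mem, new_rdi = target_body(mem, rdi)
            if not invariant(new_mem, new_rdi, rewrite_body(r9)):
                return ("loop", (mem, rdi, r9))    # invariant not maintained
        elif rdi != r9:
            return ("exit", (mem, rdi, r9))        # live outs differ
    return None


if __name__ == "__main__":
    print(find_counterexample(lambda mem, rdi, r9: mem == rdi == r9))
    print(find_counterexample(lambda mem, rdi, r9: rdi == r9))
```

The first call uses the invariant 8(rsp) = rdi = r9' and finds no counter-example. The second drops the 8(rsp) conjunct and fails on the loop obligation, because the Target reloads rdi from memory. A failing state like this is the kind of counter-example that could be fed back into inference as additional data.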

Page 16: DDEC: Data-Driven Equivalence Checking

Limitations

Insufficient data to infer a sound relation

Expressiveness of invariants
  Inequalities, quantifiers, etc.

Expressiveness of SMT solver
  Floating point, multiply, divide, etc.

Page 17: DDEC: Data-Driven Equivalence Checking

Implementation

Run tests and generate data
  https://github.com/eschkufz/x64asm

Nullspace computation
  libIML: integer matrix library

SMT solver: Z3

Page 18: DDEC: Data-Driven Equivalence Checking

Case studies

Compute kernel inside OpenSSL

Validating CompCert against gcc

Stochastic optimization for loops

Page 19: DDEC: Data-Driven Equivalence Checking

OpenSSL

Multiplication kernel

Extensive performance tests
  Run the kernel ~15 million times
  Choose 16 random tests for inference

Compile with gcc -O0 and gcc -O3
  Successfully prove equivalence

Page 20: DDEC: Data-Driven Equivalence Checking

Cross compiler validation

Page 21: DDEC: Data-Driven Equivalence Checking

STOKE

Page 22: DDEC: Data-Driven Equivalence Checking

Optimization results

  Program   STOKE vs gcc -O0   STOKE vs gcc -O3
  Bansal    1.58X              1.04X
  SAXPY     9.22X              1.48X

Page 23: DDEC: Data-Driven Equivalence Checking

Conclusion

Prove equivalence of loops in two stages
  Infer simulation relation
  Check the inferred relation using SMT solvers

Use runtime data for inference

No change required to the compilers

Better verifiers lead to better optimizers

Page 24: DDEC: Data-Driven Equivalence Checking

Inference from concrete states

M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35–45, 2007

T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. ICSE 2012

P. Garg, C. Löding, P. Madhusudan, D. Neider: Learning Universally Quantified Invariants of Linear Data Structures. CAV 2013

R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, A. V. Nori: A Data Driven Approach for Algebraic Loop Invariants. ESOP 2013

R. Sharma, S. Gupta, B. Hariharan, A. Aiken, A. V. Nori: Verification as Learning Geometric Concepts. SAS 2013

A.V. Nori, R. Sharma: Termination proofs from tests. ESEC/SIGSOFT FSE 2013