Code2Inv: Learning Loop Invariants for Program Verification · xsi/data/code2inv_poster.pdf · 2019....

Post on 23-Aug-2020

Page 1: Code2Inv: Learning Loop Invariants for Program Verification · xsi/data/code2inv_poster.pdf · 2019. 11. 24.

Code2Inv: Learning Loop Invariants for Program Verification

Xujie Si*, Hanjun Dai*, Mukund Raghothaman, Mayur Naik, Le Song (*equal contribution)

University of Pennsylvania & Georgia Institute of Technology Contact: {xsi,rmukund,mhnaik}@cis.upenn.edu; {hanjun.dai, lsong}@cc.gatech.edu

Problem Statement

Experiments

Background

SSA

Attention for a == b; attention for c >= -1 && n >= 1

Visualizing attention over code

Successfully generalizes to new programs

Higher performance even without training

Attention and CE improve performance

With 1 confounding variable; with 5 confounding variables

Verification cost; sample complexity

Collected 133 benchmark programs

Compared our framework Code2Inv with 3 SOTA approaches

Code and data: https://github.com/PL-ML/code2inv

Counter-example   Attention   #Solved Instances   Max #Z3 Queries   Max #Parameter Updates
✘                 ✘           91                  415K              441K
✘                 ✔           94                  147K              162K
✔                 ✘           95                  392               337K
✔                 ✔           106                 276               290K
LSTM + CE + Attention         93                  32                661K

Challenges

How to represent programs so that a neural policy can be learned?

How to predict loop invariant based on the neural representation?

How to learn a neural policy from an automated theorem prover?

Representing programs using ASTs and Dataflow

Loop Invariant Prediction Framework

Loop invariant

Program embedding

Partial loop invariant

Logical operator

Predicate AST

Multi-step decision making process
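
The labels above describe building an invariant piece by piece. The following is a toy sketch of that idea only: a uniform random choice stands in for the learned neural policy, and the grammar (a conjunction of disjunctions of comparisons), the variable names, and every function name are assumptions made for illustration, not the poster's implementation.

```python
import operator
import random

# Hypothetical vocabulary for the toy grammar (not taken from the poster).
VARS = ["x", "y"]
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq}
CONSTS = [0, 1]


def sample_predicate(rng):
    """Three small decisions: pick a variable, a comparison, a constant."""
    return (rng.choice(VARS), rng.choice(sorted(OPS)), rng.choice(CONSTS))


def sample_invariant(rng, max_clauses=2, max_preds=2):
    """Grow a partial invariant one predicate at a time.

    The result is a list of clauses; each clause is a list of predicates.
    Clauses are joined with AND, predicates inside a clause with OR.
    """
    clauses = []
    for _ in range(rng.randint(1, max_clauses)):
        clause = []
        for _ in range(rng.randint(1, max_preds)):
            clause.append(sample_predicate(rng))  # extend the partial invariant
        clauses.append(clause)
    return clauses


def render(clauses):
    """Print the invariant in C-like syntax."""
    return " && ".join(
        "(" + " || ".join(f"{v} {op} {c}" for v, op, c in clause) + ")"
        for clause in clauses
    )


def evaluate(clauses, env):
    """Evaluate the invariant on a concrete variable assignment."""
    return all(
        any(OPS[op](env[v], c) for v, op, c in clause) for clause in clauses
    )
```

In a learned setting, each `rng.choice` call would be replaced by a policy conditioned on the program embedding and the partial invariant built so far; the random version only shows the shape of the decision sequence.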

Why is this interesting?

o Proving {𝑃} 𝑆 {𝑄} requires deep logical reasoning

o Finding 𝑰 is a fundamental problem in program verification

o Traditionally, ad hoc features are used to find 𝑰

o Challenge problem in Artificial Intelligence

1. How to find 𝑰 in order to prove {𝑃} 𝒘𝒉𝒊𝒍𝒆 𝐵 𝒅𝒐 𝑆 {𝑄}?

2. Given a set of programs 𝑆ᵢ ~ 𝒫 sampled from an unknown distribution 𝒫, can we learn from them and generalize the learned strategy to unseen programs?

“If 𝑃 holds before executing 𝑆, then 𝑄 holds afterwards.”

“Loop Invariant”

Program Verification

How to prove {𝑃} 𝑆 {𝑄}?

E.g. for the last program, 𝑰 = 𝑥 < 0 ∨ 𝑦 > 0
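
To make the role of 𝑰 concrete, here is a minimal sketch of the three standard conditions a loop invariant must satisfy to prove {𝑃} 𝒘𝒉𝒊𝒍𝒆 𝐵 𝒅𝒐 𝑆 {𝑄}: 𝑃 implies 𝑰; 𝑰 is preserved by 𝑆 whenever 𝐵 holds; and 𝑰 together with ¬𝐵 implies 𝑄. The program the poster refers to is not shown in this transcript, so the loop below is an assumed example for which 𝑥 < 0 ∨ 𝑦 > 0 works. The check is a bounded brute-force search rather than a theorem prover query, so it can find counter-examples but does not constitute a proof.

```python
from itertools import product

# Assumed program (hypothetical, chosen so that x < 0 or y > 0 is an invariant):
#   x = -50
#   while x < 0: x = x + y; y = y + 1
#   assert y > 0
def pre(x, y):    # precondition P
    return x == -50

def guard(x, y):  # loop condition B
    return x < 0

def body(x, y):   # loop body S, returns the next state
    return x + y, y + 1

def post(x, y):   # postcondition Q
    return y > 0


def check_invariant(inv, lo=-60, hi=60):
    """Search a bounded grid of states for violations of the three conditions.

    Returns a list of (condition_name, state) counter-examples; an empty
    list means no violation was found within the bounds.
    """
    failures = []
    for s in product(range(lo, hi + 1), repeat=2):
        if pre(*s) and not inv(*s):
            failures.append(("initiation", s))      # P does not imply I
        if inv(*s) and guard(*s) and not inv(*body(*s)):
            failures.append(("consecution", s))     # I not preserved by S
        if inv(*s) and not guard(*s) and not post(*s):
            failures.append(("postcondition", s))   # I and not B do not imply Q
    return failures


def candidate(x, y):
    return x < 0 or y > 0
```

Calling `check_invariant(candidate)` finds no violations on this grid, whereas a candidate that is too weak (always true) fails the postcondition check and one that is too strong (`y > 0` alone) fails initiation. The returned states play the same role as the counter-examples a solver such as Z3 would report.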

“Program testing can be used to show the presence of bugs, but never to show their absence!”

— Edsger W. Dijkstra