USC Information Sciences Institute
Pedro C. Diniz
University of Southern California / Information Sciences Institute
4676 Admiralty Way, Suite 1001
Marina del Rey, California 90292
Increasing the Accuracy of Shape and Safety Analysis for Pointer-based Codes*
* This work is partly funded by the National Science Foundation (NSF) under award number CCR-0209228.
Introduction and Motivation

Static Shape Analysis
• Understand Topological Properties of Data Structures: Tree, DAG, Graph
• Topology Induced by a Subset of the Pointer Fields

Focus:
• C Codes that Allocate Memory via malloc/free Functions
• Traverse and Change Data Structure through Pointers

Applications:
• Redundant Load/Store Elimination
• Instruction Scheduling
• Parallelization
• “Bug” Finding
Basic Approach
Loss of Accuracy
• Need for Summarization
• Abstract After Each Statement
• Ignores Control Flow Predicates
• Ignores Node “Configurations”

[Figure: control-flow graph with "stat" and "cond" nodes]

Abstract Interpretation
• Execute Each Statement
• Materialize & Abstract
• Fix-Point

Invariants: {next.prev, prev.next}

Abstract Storage Graph (ASG)
Example

Observations:
1. Loop Does not Modify Data Structure
2. “Scan” structure along “next”
3. If body executes (stats 5 & 6) then on exit (t != NULL)
4. AND (p == t->next) holds
5. On exit a few “contexts” hold
6. This loop is “safe”, i.e. no null pointer is ever dereferenced.
7. The loop terminates:
   • Iff structure is acyclic along “next” if it terminates from stat 2
   • Only sufficient condition if it terminates from stat 4.
1: t = NULL;
2: while(p != NULL){
3: if (p->data < item)
4: break;
5: t = p;
6: p = p->next;
7: }
Contexts on exit:
• Context #1: t == NULL; p == NULL
• Context #2: t == NULL; p != NULL; p->data < item
• Context #3: t != NULL; p != NULL; p->data < item; p = t->next; p == p->next(k)
• Context #4: t != NULL; p == NULL; p->data < item; p = t->next; p == p->next(k)
Example
if (p->next != NULL) {
    p->next->prev = temp;
    temp->next = p->next;
    p->next = temp;
    temp->prev = p;
}
Observations:
1. Modification for a node s.t. p->next != NULL
2. Need to know relation between temp and p
[Figure: ASG with nodes labeled p and temp]
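The fragment on this slide can be exercised with a small C sketch. The `struct node` layout and the helper names `links_consistent` and `insert_after_if_not_last` are assumptions made for illustration; the slides show only the `next` and `prev` fields. The check mirrors the {next.prev, prev.next} invariants listed on the Basic Approach slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type: only next/prev appear on the slide. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Local form of the {next.prev, prev.next} invariants:
   n->next->prev == n and n->prev->next == n wherever the link exists. */
static bool links_consistent(const struct node *n)
{
    if (n->next != NULL && n->next->prev != n) return false;
    if (n->prev != NULL && n->prev->next != n) return false;
    return true;
}

/* The slide's statements: splice temp in after p, but only when p has
   a successor. Returns true if the splice was performed. */
static bool insert_after_if_not_last(struct node *p, struct node *temp)
{
    if (p->next != NULL) {
        p->next->prev = temp;
        temp->next = p->next;
        p->next = temp;
        temp->prev = p;
        return true;
    }
    return false;
}
```

When `temp` is a fresh node distinct from `p`, the invariants hold at `p`, `temp`, and the old successor after the call. When `temp` aliases `p`, the same four statements leave `p->next == p`, which is one concrete reason the relation between `temp` and `p` matters.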
What’s the Point?
Programmers Fundamentally Encode “State” via Conditionals and Loop Constructs

A Typical Programming Style is to Use
• Loop constructs to scan the structures to position pointer variables at nodes that should be modified.
• Conditional statements to define which operations should be performed.
Shape Analysis and Safety Algorithms should Exploit the Information Conveyed in these Statements.
Basic Analysis
Structural Fields & Node Configurations
Scan Loops
Assumed/Verified Properties
Context Tracing
Scan Loops

Typical Scan loops are short!
• Read-Only Heap Pointer Values
• Use Stack/Global Variables

Symbolic Pointer Analysis
• Symbolically Execute Loop Statements for Zero-trip and Multi-trip Cases
• Relationships between Pointers on Exits
• Symbolic Value Numbering (iteration-based) Across All Loop Internal Paths
• Reach Closed-form Expressions (see HN90)
• Convert Loop into a Multi-way Statement
Scan Loop and Tracing Contexts
t = NULL;
while (p != NULL) {
    if (p->data < item)
        break;
    t = p;
    p = p->next;
}
C0 = { t -> t(0), p -> p(0) }
C1 = { t -> NULL, p -> p(0) }

T(i,i+1) = {
  t(i+1) -> p(i);
  p(i+1) -> p(i)->next;
  p(i+1) == t(i+1)->next;
  p(i) != NULL;
  t(i+1) != NULL;
}

T(0,i+1) = {
  t -> t(i+1);
  p -> p(i+1);
  t(i+1) = p(0)(->next)^i;
  p(i+1) = p(0)(->next)^(i+1);
  p(i+1) = t(i+1)->next;
  p(i) != NULL;
  t(i+1) != NULL;
}
[Figure: flowchart of the scan loop with nodes "t = NULL", "p != NULL", "p->data < item", and "t = p; p = p->next;"; exits labeled C2, C3 and C4, C5. Caption: Symbolic Loop Transfer Function]
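The closed form T(0,i+1) can be checked dynamically against a concrete run of the loop. The following is a minimal sketch, not the symbolic analysis itself: it runs the loop, counts trips, and compares the final `t` and `p` with p(0)(->next)^i and p(0)(->next)^(i+1). The `struct node` layout and the names `follow_next`, `scan`, `scan_result`, and `closed_form_holds` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with the two fields the loop uses. */
struct node {
    int data;
    struct node *next;
};

/* p0(->next)^k: follow `next` k times starting at p0.
   The caller must ensure the first k nodes exist. */
static struct node *follow_next(struct node *p0, unsigned k)
{
    while (k-- > 0)
        p0 = p0->next;
    return p0;
}

/* Final values of t and p plus the number of completed iterations. */
struct scan_result {
    struct node *t;
    struct node *p;
    unsigned trips;
};

/* The scan loop from the slide, instrumented with a trip counter. */
static struct scan_result scan(struct node *p, int item)
{
    struct scan_result r = { NULL, p, 0 };
    while (r.p != NULL) {
        if (r.p->data < item)
            break;
        r.t = r.p;
        r.p = r.p->next;
        r.trips++;
    }
    return r;
}

/* Zero trips: t == NULL and p == p(0) (context C1 carried to the exit).
   trips == i+1 >= 1: the relations listed in T(0,i+1). */
static int closed_form_holds(struct node *p0, struct scan_result r)
{
    if (r.trips == 0)
        return r.t == NULL && r.p == p0;
    return r.t != NULL
        && r.t == follow_next(p0, r.trips - 1)
        && r.p == follow_next(p0, r.trips)
        && r.p == r.t->next;
}
```

For a list holding 5, 4, 3 and `item == 4`, the loop completes two trips, so `t` is the second node and `p` the third, matching t = p(0)(->next)^1 and p = p(0)(->next)^2.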
Contexts On Exit of Scan Loop
Zero-Trip, Exit #1:
C2 = {
  t -> NULL, p -> p(0)
  t = NULL
  p = NULL
}

Zero-Trip, Exit #2:
C4 = {
  t -> NULL, p -> p(0)
  t = NULL
  p != NULL
}

Multi-Trip, Exit #1:
C3 = {
  t -> t(i+1), p -> p(i+1)
  p(i+1) = NULL
  t(i+1) = p(0)(->next)^i
  p(i+1) = p(0)(->next)^(i+1)
  p(i+1) = t(i+1)->next
  t(i+1) != NULL
}

Multi-Trip, Exit #2:
C5 = {
  t -> t(i+1), p -> p(i+1)
  p(i+1) != NULL
  t(i+1) = p(0)(->next)^i
  p(i+1) = p(0)(->next)^(i+1)
  p(i+1) = t(i+1)->next
  t(i+1) != NULL
}
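The four exit contexts amount to a two-by-two case split: zero-trip versus multi-trip, and exit through the loop test versus exit through the break. A minimal C sketch of that split follows; the enum, the `struct node` layout, and the function name `scan_exit_context` are assumptions for illustration, and the code classifies one concrete execution rather than performing the symbolic analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with the two fields the loop uses. */
struct node {
    int data;
    struct node *next;
};

/* Labels follow the slide: C2/C3 leave through the loop test (p == NULL),
   C4/C5 leave through the break (p != NULL). */
enum exit_context { C2, C3, C4, C5 };

/* Runs the scan loop and reports which exit context the run ended in.
   The final t and p are written through t_out and p_out. */
static enum exit_context scan_exit_context(struct node *p, int item,
                                           struct node **t_out,
                                           struct node **p_out)
{
    struct node *t = NULL;
    int tripped = 0;            /* set once the body has completed a trip */

    while (p != NULL) {
        if (p->data < item)
            break;
        t = p;
        p = p->next;
        tripped = 1;
    }
    *t_out = t;
    *p_out = p;

    if (!tripped)
        return (p == NULL) ? C2 : C4;   /* zero-trip contexts */
    return (p == NULL) ? C3 : C5;       /* multi-trip contexts */
}
```

In the multi-trip contexts the caller may rely on `t != NULL` and `p == t->next`; in the zero-trip contexts `t` is NULL and must not be dereferenced.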
Why Are Contexts Important?
Establish Symbolic Pointer/Values Relationships
• Allow Analyses to Discriminate Between “Nodes” of an Abstract Shape Representation for Increased Accuracy
• Identify Potential Non-Trivial “Bugs”

Scan loop:
t = NULL;
while (p != NULL) {
    if (p->data < item)
        break;
    t = p;
    p = p->next;
}

Code following the loop, first variant:
if (t != NULL) {
    stat; // with p = t->next
}

Code following the loop, second variant:
if (p != t->next) {
    t->next = NULL;
}
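One possible reading of the two variants, offered as an interpretation rather than something the slide states: the first guards on `t != NULL` before relying on `p == t->next`, while the second evaluates `t->next` with no guard even though `t` is NULL in the zero-trip contexts. The sketch below reports which relation holds after a concrete run of the scan loop; the `struct node` layout, the enum, and the name `relation_after_scan` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type with the two fields the loop uses. */
struct node {
    int data;
    struct node *next;
};

enum relation {
    T_IS_NULL,        /* zero-trip exit: t->next must not be evaluated */
    P_IS_T_NEXT,      /* multi-trip exit: p == t->next */
    P_IS_NOT_T_NEXT   /* would make `p != t->next` true */
};

/* Runs the scan loop, then classifies the relation between p and t. */
static enum relation relation_after_scan(struct node *p, int item)
{
    struct node *t = NULL;

    while (p != NULL) {
        if (p->data < item)
            break;
        t = p;
        p = p->next;
    }
    if (t == NULL)
        return T_IS_NULL;
    return (p == t->next) ? P_IS_T_NEXT : P_IS_NOT_T_NEXT;
}
```

Because the loop never writes to the heap, every run that completes at least one trip ends with `p == t->next`, so the third outcome is not expected for any input; that is the kind of fact the exit contexts make available to a checker.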
Termination
Derive Sufficient Termination Conditions
• Look at Loop Transfer Function(s)
• Exit Predicates

T(i,i+1) = {
  t(i+1) -> p(i),
  p(i+1) -> p(i)->next
}

Predicates:
  p != NULL
  p->data < item

Acyclic(next) = TRUE
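The property Acyclic(next) is derived statically in the talk. As a runtime stand-in only, the same property can be tested on a concrete list with Floyd's tortoise-and-hare cycle detection, a different technique from the analysis described here. The `struct node` layout and the name `acyclic_next` are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; only the `next` field matters here. */
struct node {
    int data;
    struct node *next;
};

/* Returns true when following `next` from head reaches NULL, i.e. the
   structure is acyclic along `next`. Floyd's algorithm: `fast` advances
   two links per step and `slow` one; they can only meet inside a cycle. */
static bool acyclic_next(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return false;
    }
    return true;
}
```

If `acyclic_next(p)` holds on entry, the transfer function p(i+1) -> p(i)->next must eventually reach NULL, so the scan loop terminates whether or not the break is taken.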
Safety (non-Nil Dereferencing)
Examine Contexts
• Check Whether Predicates Ensure Each Dereference is Safe
• If Not, Can Derive (Min) Predicates that Do
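[Figure: flowchart of the scan loop with nodes "t = NULL", "p != NULL", "p->data < item", and "t = p; p = p->next;"; exits labeled C2, C3 and C4, C5; annotations {p(i) != NULL}, {p(i) != NULL}, {t(i+1) != NULL; p(i+1) ?}, and the query "p->next != NULL ?"]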
How Frequent are Scan Loops?
Program     Lines  #Loops  #PtrLoops  #ScanLoops
bintree       200       5          1           1
em3d          148      11          6           5
hash           96       7          3           3
blocks2       560      45         17           6
chomp         298      24         10           4
sparse       1170      89         76          56
graphics      686      28         14           4
paraffins     166      17          8           2
nbody         808      13         11           2
pug          2958      77         39          26
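Summing the columns gives an overall picture: 316 loops, 185 pointer loops, and 109 scan loops, so roughly 59% of the pointer loops in these programs are scan loops. The C sketch below reproduces that arithmetic from the table rows; the struct and function names are hypothetical, and the totals are derived here rather than reported on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the table. */
struct bench {
    const char *name;
    int lines, loops, ptr_loops, scan_loops;
};

static const struct bench benchmarks[] = {
    { "bintree",    200,  5,  1,  1 },
    { "em3d",       148, 11,  6,  5 },
    { "hash",        96,  7,  3,  3 },
    { "blocks2",    560, 45, 17,  6 },
    { "chomp",      298, 24, 10,  4 },
    { "sparse",    1170, 89, 76, 56 },
    { "graphics",   686, 28, 14,  4 },
    { "paraffins",  166, 17,  8,  2 },
    { "nbody",      808, 13, 11,  2 },
    { "pug",       2958, 77, 39, 26 },
};

struct totals {
    int loops, ptr_loops, scan_loops;
};

/* Column sums over all rows of the table. */
static struct totals sum_benchmarks(void)
{
    struct totals t = { 0, 0, 0 };
    size_t n = sizeof benchmarks / sizeof benchmarks[0];

    for (size_t i = 0; i < n; i++) {
        t.loops      += benchmarks[i].loops;
        t.ptr_loops  += benchmarks[i].ptr_loops;
        t.scan_loops += benchmarks[i].scan_loops;
    }
    return t;
}

/* Share of pointer loops that are scan loops, as a whole percentage. */
static int scan_share_percent(struct totals t)
{
    return (100 * t.scan_loops + t.ptr_loops / 2) / t.ptr_loops;
}
```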
Putting the Pieces Together
[Figure: flow diagram with the following boxes and labels]
• Coarse-Grain Shape Analysis (GH:POPL96)
• Scan Loop
• Termination & Safety
• Context Tracing
• Fine-Grain Shape Analysis
• Use Results from Coarse-Grain Analysis
• Abstract Storage Graph (ASG)
• Properties Hold: YES / NO
• Assumed Properties
• Use Results from Fine-Grain Shape Analysis
Related Work

Shape Analysis
• LH88:PLDI88, CWZ90:PLDI90
• PKC93:LCPC93
• Deutsch94:PLDI94
• SRW98:TOPLAS98,POPL99
• HHN94:IPPS94, HHN94:PLDI94, GH96:POPL96
• CAZ:LCPC01
• KR:POPL02

(Static) Safety Analysis
• Colby97:LoyolaUnivTechRep97
• Evans96:PLDI96
• DRS98:PASTE98

Program Checking
• NL98:PLDI98
• Ball:PLDI01
Summary
Symbolic Analyses
• Structural Fields and Node Configurations
• Scan Loops
• Assumed and Verified Properties for Termination
• Context Tracing for Accurate Pointer Relationships

Thesis:
In order to increase the accuracy of shape and safety analysis algorithms, compilers must uncover and exploit the knowledge encoded in conditional statements.