CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C
Nurit Dor (TAU), Michael Rodeh (IBM Research Haifa), Mooly Sagiv (TAU)
Greta Yorsh (TAU)?
Seminar in Program Analysis for Cyber-Security
Ittay Eyal, March 2011
High-Level Structure
Example
void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;
    for (indice = 0; indice < NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}
Core C
• Control-flow statements: if, goto, break, or continue
• Expressions are side-effect free and cannot be nested
• All assignments are statements
• Declarations do not have initializations
• Address-of formal variables is not allowed
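To make these restrictions concrete, here is a small sketch of how a loop with nested, side-effecting expressions might be lowered by hand into the restricted form. The function names copy_c and copy_corec are hypothetical, and the lowering illustrates the rules above; it is not the output of the CSSV front end.

```c
#include <assert.h>
#include <string.h>

/* Ordinary C: nested expressions with side effects in the loop condition. */
void copy_c(char *dst, const char *src)
{
    while ((*dst++ = *src++) != '\0')
        ;
}

/* Hand-lowered CoreC-style version (illustrative only): control flow uses
 * just if/goto, expressions are not nested and have no side effects, every
 * assignment is its own statement, and declarations have no initializers. */
void copy_corec(char *dst, const char *src)
{
    char c;

begin_loop:
    c = *src;
    *dst = c;
    src = src + 1;
    dst = dst + 1;
    if (c == '\0') goto end_loop;
    goto begin_loop;
end_loop:
    return;
}
```

Both functions copy the string including its terminator, so calling them on the same input should leave identical buffers. The lowered form is longer, but each statement touches at most one memory location, which is what makes the later translation steps simpler.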
void RTC_Si_SkipLine(const INT32 NbLine, char ** const PtrEndText)
{
    INT32 indice;
    for (indice = 0; indice < NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';
    return;
}

void SkipLine(int NbLine, char** PtrEndText)
{
    int indice;
    char* PtrEndLoc;
    indice = 0;
begin_loop:
    if (indice >= NbLine) goto end_loop;
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\n';
    *PtrEndText = PtrEndLoc + 1;
    indice = indice + 1;
    goto begin_loop;
end_loop:
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\0';
}
Contracts
Describe input, side effects, and output:
• Requires
• Modifies
• Ensures
void SkipLine(int NbLine, char** PtrEndText)
    requires is_within_bounds(*PtrEndText) &&
             *PtrEndText.alloc > NbLine &&
             NbLine >= 0
    modifies *PtrEndText, *PtrEndText.is_nullt, *PtrEndText.strlen
    ensures  *PtrEndText.is_nullt &&
             *PtrEndText.strlen == 0 &&
             *PtrEndText == [*PtrEndText]pre + NbLine;

void SkipLine(int NbLine, char** PtrEndText)
{
    int indice;
    char* PtrEndLoc;
    indice = 0;
begin_loop:
    if (indice >= NbLine) goto end_loop;
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\n';
    *PtrEndText = PtrEndLoc + 1;
    indice = indice + 1;
    goto begin_loop;
end_loop:
    PtrEndLoc = *PtrEndText;
    *PtrEndLoc = '\0';
}
void main()
{
    char buf[SIZE];
    char *r, *s;
    r = buf;
    SkipLine(1, &r);
    fgets(r, SIZE-1, stdin);
    s = r + strlen(r);
    SkipLine(1, &s);
}
P → inline(P)
• Function entry point:
  • Assume pre-conditions.
  • Store inputs ([x]pre) in temporary variables for the post-conditions check.
• Return:
  • Set return_value_P.
• Function exit:
  • Assert post-conditions.
• Function call and its result assertion:
  • Assert pre-conditions.
  • Assume post-conditions (possibly w.r.t. inputs).
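The placement of assume and assert can be sketched on the SkipLine example. This is a hand-written approximation, not what the tool emits: the Region struct, alloc_from, the ASSUME/ASSERT macros, and call_site are all hypothetical names, and both macros are plain runtime checks here, whereas a static verifier treats an assumption as a constraint on its state and an assertion as a proof obligation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: runtime checks instead of analysis directives. */
#define ASSUME(c) assert(c)
#define ASSERT(c) assert(c)

/* Hypothetical shadow description of the buffer *PtrEndText points into. */
typedef struct {
    char  *base;  /* start of the allocation */
    size_t size;  /* allocation size in bytes */
} Region;

/* Bytes from p to the end of the region (the contract's ".alloc"). */
static size_t alloc_from(const Region *r, const char *p)
{
    return (size_t)(r->base + r->size - p);
}

void SkipLine_inlined(int NbLine, char **PtrEndText, const Region *r)
{
    /* Entry: assume the preconditions, save [x]pre for the exit check. */
    ASSUME(*PtrEndText >= r->base && *PtrEndText <= r->base + r->size);
    ASSUME(NbLine >= 0);
    ASSUME(alloc_from(r, *PtrEndText) > (size_t)NbLine);
    char *pre = *PtrEndText;

    int indice;
    for (indice = 0; indice < NbLine; indice++) {
        **PtrEndText = '\n';
        (*PtrEndText)++;
    }
    **PtrEndText = '\0';

    /* Exit: assert the postconditions. */
    ASSERT(**PtrEndText == '\0');         /* is_nullt && strlen == 0 */
    ASSERT(*PtrEndText == pre + NbLine);  /* advanced by NbLine */
}

/* Call site: assert the callee's preconditions; the analysis would then
 * assume the postconditions rather than re-analyze the callee body. */
char *call_site(char *buf, size_t size)
{
    Region r = { buf, size };
    char *p = buf;
    ASSERT(alloc_from(&r, p) > 1);  /* precondition for NbLine == 1 */
    SkipLine_inlined(1, &p, &r);
    return p;
}
```

Calling call_site on an 8-byte buffer should return buf + 1 with buf[0] set to '\n' and buf[1] set to '\0'. The point of the scheme is that each procedure is checked once against its own contract, and callers only see the contract.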
Pointer Analysis
• Goal: determine which objects may be updated through a pointer.
• A whole-program points-to state is calculated first.
• A per-procedure points-to state is then derived from it.
Pointer Analysis
foo(char *p, char *q) {
    char local[100];
    …
    p = local;
    *q = 0;
    …
}
main() {
    char s[10], t[20], r[30];
    char *temp;
    foo(s, t);
    foo(s, r);
    …
    temp = s;
    …
}
[Figure: whole-program points-to graph; nodes: s, t, r, temp, local, p, q]
Pointer Analysis - Parametrization for foo
[Figure: points-to graph for the same foo/main code as seen from inside foo; nodes: PARAM #1, PARAM #2, local, p, q]
C to Integer Program
C2IP
• Inputs: inline(P) and pointer info
• Output: integer program

Constraint variables for each location l:
• l.val: possible values.
• l.offset: offset w.r.t. the base address.
• l.aSize: allocation size.
• l.is_nullt: null-terminated?
• l.len: string length (with \0).
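As a concrete reading of these properties, the sketch below computes them at run time for one pointer into one buffer. The StrProps struct and the describe function are hypothetical names introduced for illustration; the tool tracks these values symbolically as constraint variables rather than computing them. l.val is omitted because it applies to integer values, not to string buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record mirroring the per-location properties listed above. */
typedef struct {
    ptrdiff_t offset;   /* l.offset: distance from the base address */
    size_t    aSize;    /* l.aSize: allocation size of the buffer */
    bool      is_nullt; /* l.is_nullt: a '\0' occurs before the buffer ends */
    size_t    len;      /* l.len: length counting the '\0' (if is_nullt) */
} StrProps;

/* Describe pointer p into the buffer [base, base + aSize). */
StrProps describe(const char *base, size_t aSize, const char *p)
{
    StrProps s;
    assert(p >= base && p <= base + aSize);
    s.offset = p - base;
    s.aSize = aSize;
    s.is_nullt = false;
    s.len = 0;
    for (size_t i = (size_t)s.offset; i < aSize; i++) {
        if (base[i] == '\0') {
            s.is_nullt = true;
            s.len = i - (size_t)s.offset + 1;  /* includes the terminator */
            break;
        }
    }
    return s;
}
```

For an 8-byte buffer holding "hi" and a pointer one byte past its start, this yields offset 1, aSize 8, is_nullt true, and len 2 (the character 'i' plus the terminator).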
C to Integer Program - Expression Check
C to Integer Program - Constructs to Statements
C to Integer Program

Notation
• V: the number of variables and allocation sites.
• S: the number of C expressions.

Integer Program Complexity
• O(V) constraint variables.
• Each pointer may point to O(V) locations.
• Total complexity: O(S · V), since each of the S expressions may have to be translated for O(V) pointed-to locations.
Integer Analysis
• Calculates the inequalities that hold at each point.
• Conservative.
• Each assertion is verified against the inequalities.
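A minimal sketch of what this means for SkipLine, assuming the pointer *PtrEndText has already been replaced by an integer offset into a buffer of aSize bytes. The function skipline_ip and its variable names are hypothetical, and the runtime assert calls stand in for facts that a linear-inequality analysis would establish statically.

```c
#include <assert.h>

/* Integer-only model of SkipLine. Returns the final offset. */
int skipline_ip(int NbLine, int offset, int aSize)
{
    /* Preconditions from the contract, restated over integers. */
    assert(0 <= offset && offset <= aSize);
    assert(NbLine >= 0);
    assert(aSize - offset > NbLine);

    int offset_pre = offset;
    int indice = 0;
    while (indice < NbLine) {
        /* Linear relation among the variables at this program point. */
        assert(offset == offset_pre + indice && indice < NbLine);
        /* Buffer-safety assertion: the write of '\n' is in bounds. */
        assert(offset < aSize);
        offset = offset + 1;
        indice = indice + 1;
    }
    /* At exit: offset == offset_pre + NbLine < aSize. */
    assert(offset == offset_pre + NbLine);
    assert(offset < aSize);  /* the write of '\0' is in bounds */
    return offset;
}
```

The in-bounds assertions follow from the loop relation combined with the precondition: offset = offset_pre + indice, indice ≤ NbLine, and aSize - offset_pre > NbLine together give offset < aSize. This is the kind of implication that verifying an assertion against the computed inequalities amounts to.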
Integer Analysis

*PtrEndText.alloc > NbLine

void main()
{
    char buf[SIZE];
    char *r, *s;
    r = buf;
    SkipLine(1, &r);
    fgets(r, SIZE-1, stdin);
    s = r + strlen(r);
    SkipLine(1, &s);
}
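One way to read this slide is to check the precondition by hand for the second call, SkipLine(1, &s). The sketch below does the worst-case arithmetic, assuming fgets stores the longest string it is allowed to (one character fewer than its count argument). The function names are hypothetical, and this is a manual calculation rather than the analysis itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Worst-case number of bytes from s to the end of buf[size] in main(). */
int worst_case_alloc_at_s(int size)
{
    assert(size >= 3);                     /* room for r and a non-empty read */
    int r_offset = 1;                      /* SkipLine(1, &r) advances r by 1 */
    int max_strlen = (size - 1) - 1;       /* fgets(r, SIZE-1, stdin) */
    int s_offset = r_offset + max_strlen;  /* s = r + strlen(r) */
    return size - s_offset;
}

/* Precondition of SkipLine(NbLine, &s): *PtrEndText.alloc > NbLine. */
bool second_call_precondition_holds(int size, int NbLine)
{
    return worst_case_alloc_at_s(size) > NbLine;
}
```

Under that assumption the remaining space at s is 1 byte for any SIZE, so the requirement alloc > NbLine cannot be established for NbLine = 1: the call would write '\n' into the last byte of buf and '\0' one byte past it.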
Integer Analysis - Contracts
To optimize the contracts, do the following:
1. Assume true preconditions.
   • Use ASPost [1] to calculate the linear inequalities at the exit point.
   • Deduce the postconditions.
2. Use AWPre to calculate backwards the most liberal preconditions.

[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang., 1978.
Implementation
• C → CoreC: based on the AST-Toolkit [32]
• Points-to analysis: Golf [8, 9]
• Integer analysis: Polyhedra library [6, 19]

[6] P. Cousot and N. Halbwachs. Automatic discovery of linear constraints among variables of a program. In Symp. on Princ. of Prog. Lang., 1978.
[8] M. Das. Unification-based pointer analysis with directional assignments. In SIGPLAN Conf. on Prog. Lang. Design and Impl., 2000.
[9] M. Das, B. Liblit, M. Fähndrich, and J. Rehof. Estimating the impact of scalable pointer analysis on optimization. In Static Analysis Symp., 2001.
[19] B. Jeannet. New polka library. Available at "http://www.irisa.fr/prive/Bertrand.Jeannet/newpolka.html".
[32] Microsoft Research. AST-toolkit. 2002.
Empirical Results
Source from two real-world projects:
• String manipulation library from EADS Airbus code. 11 procedures, 400 lines.
• Part of the WEB2c converter. 8 procedures, 460 lines.
Conclusion
• Not easy to analyze C.
• Plenty of techniques and tools.
• High false-positive ratio without hand-crafted contracts.
• Experimental results section slim.
  • High variance for little data.
  • (They had to write all contracts…)
• What would happen to normal code?