Static Detection of Buffer Overrun In C Code
Lucas Silacci
CSE 231, Spring 2000
University of California, San Diego
Introduction
“A First Step Towards Automated Detection of Buffer Overrun Vulnerabilities”
David Wagner
Jeffrey S. Foster
Eric Brewer
Alexander Aiken
University of California, Berkeley
Topics of Discussion
The Buffer Overrun Problem
Static Analysis through Constraints
Experience with the tool
Performance
Limitations of the tool
Conclusions
Future Directions
Buffer Overrun
The problem? Lots of unsafe legacy C code!
  The fingerd attack in 1988 is a prime example
C is inherently unsafe
  Array and pointer references are not automatically bounds-checked
  The standard C library is unsafe
Standard C Library is Unsafe
Inconsistencies in the library
  strncpy(dst, src, sizeof(dst)) is correct
  strncat(dst, src, sizeof(dst)) is incorrect
Encouragement of off-by-one errors
  strncat(dst, src, sizeof(dst) - strlen(dst) - 1) is correct
  But the -1 is often overlooked
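The difference between the two size arguments can be sketched as follows. This is an illustration only: the helper name `copy_and_append` and the 8-byte buffer size are assumptions made for the example, not something taken from the paper.

```c
#include <string.h>

/* Hypothetical buffer size chosen for the illustration. */
enum { DST_SIZE = 8 };

/* Hypothetical helper: copy src into dst, then append suffix,
   never writing past the end of the DST_SIZE-byte buffer. */
void copy_and_append(char *dst, const char *src, const char *suffix)
{
    /* strncpy: the third argument is the total size of dst. It leaves
       dst unterminated when src fills the buffer, so terminate it. */
    strncpy(dst, src, DST_SIZE);
    dst[DST_SIZE - 1] = '\0';

    /* strncat: the third argument is the room left after the current
       contents, minus one byte for the terminating NUL. */
    strncat(dst, suffix, DST_SIZE - strlen(dst) - 1);
}
```

Passing `DST_SIZE` to `strncat` instead would let it write up to `DST_SIZE` more characters after the existing contents, which is exactly the mistake the slide describes.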
CERT Advisories
As much as 50% of CERT-reported vulnerabilities are buffer overrun related
Static Analysis through Constraints
Why static analysis?
  Runtime testing may miss problems in code paths not followed in ordinary execution
  Opportunity to eliminate problems proactively
Fundamental Ideas
  C strings treated as an abstract data type
  Buffers modeled as pairs of integer ranges
Constraint Language
alloc(str): set of possible number of bytes allocated for string str
len(str): set of possible lengths of string str
safety condition: len(str) <= alloc(str)
Safety Condition cont.
For two ranges: len(str) = [a, b]; alloc(str) = [c, d]
  b <= c: str never overflows its buffer
  a > d: str always overflows its buffer
  ranges overlap: an overflow cannot be ruled out
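The three cases can be written as a small comparison of two ranges. The sketch below is not code from the tool: `struct range`, `classify`, and the verdict names are hypothetical. It assumes lengths count the terminating NUL, which is what the fgets range [1, n] on the next slide suggests.

```c
/* A closed integer range [lo, hi]. */
struct range { long lo, hi; };

enum verdict { NEVER_OVERFLOWS, ALWAYS_OVERFLOWS, MAY_OVERFLOW };

/* Compare the inferred length range of a string with the inferred
   allocation range of the buffer that holds it. */
enum verdict classify(struct range len, struct range alloc)
{
    if (len.hi <= alloc.lo)   /* longest string fits the smallest buffer */
        return NEVER_OVERFLOWS;
    if (len.lo > alloc.hi)    /* shortest string exceeds the largest buffer */
        return ALWAYS_OVERFLOWS;
    return MAY_OVERFLOW;      /* ranges overlap: report a possible overrun */
}
```

For instance, a length range of [1, 128] against an allocation of [128, 128] is classified as never overflowing, while [17, 144] against the same allocation overlaps and would be reported.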
Constraint Generation
Generate an integer range constraint for each line of C code
Constraints take the form X ⊆ Y, where X and Y are range variables
examples:
  char dst[n];               n ⊆ alloc(dst)
  sprintf(dst, "%s", src);   len(src) ⊆ len(dst)
  fgets(str, n, ...);        [1, n] ⊆ len(str)
Constraint Generation Example
Source Code:
  char buf[128];
  while (fgets(buf, 128, stdin)) {
      if (!strchr(buf, '\n')) {
          char error[128];
          sprintf(error, "Line too long: %s\n", buf);
          die(error);
      }
      ...
  }
The focus is on primitive string operations!
Constraints:
  [128, 128] ⊆ alloc(buf)
  [1, 128] ⊆ len(buf)
  [128, 128] ⊆ alloc(error)
  len(buf) + 16 ⊆ len(error)
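The constant 16 in the last constraint is the number of literal characters in the format string. The sketch below works the arithmetic through; `struct range`, `shift`, and `error_len_range` are hypothetical names introduced only for this illustration.

```c
#include <string.h>

/* A closed integer range [lo, hi]. */
struct range { long lo, hi; };

/* Shift a range by a constant: [lo, hi] + k = [lo + k, hi + k]. */
static struct range shift(struct range r, long k)
{
    struct range out = { r.lo + k, r.hi + k };
    return out;
}

/* Range implied for len(error) by the sprintf call in the example. */
struct range error_len_range(void)
{
    struct range len_buf = { 1, 128 };                /* from fgets(buf, 128, stdin) */
    long literal = (long)strlen("Line too long: \n"); /* literal characters: 16 */
    return shift(len_buf, literal);                   /* [17, 144] */
}
```

Since the upper bound 144 exceeds the 128 bytes allocated for error, the ranges overlap and the sprintf call would be reported as a possible overrun.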
Constraint Solver
Efficient algorithm for finding a bounding box solution to a system of constraints
  gives bounds on ranges of variables, but can't give any info on the relationship between them
Flow-insensitive analysis
  sacrifices precision for scalability, efficiency and ease of implementation
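To make the idea of a bounding box solution concrete, here is a deliberately naive sketch. It is not the solver described in the paper: it simply repeats passes over the constraints until no range grows, and it gives up after a caller-supplied number of passes because a cyclic constraint such as len(s) + 1 ⊆ len(s) would otherwise never settle. The constraint encoding and all names are assumptions made for the illustration.

```c
#include <stddef.h>

struct range { long lo, hi; };   /* empty when lo > hi */

/* One constraint: (vars[src] + k) ⊆ vars[dst], or c ⊆ vars[dst] when src < 0. */
struct constraint {
    int src;          /* index of the source variable, or -1 for a constant */
    struct range c;   /* constant range, used only when src < 0 */
    long k;           /* offset added to the source range */
    int dst;          /* index of the variable being bounded */
};

/* Widen *r so that it covers add; returns 1 if *r changed. */
static int join(struct range *r, struct range add)
{
    int changed = 0;
    if (add.lo > add.hi) return 0;              /* nothing to add */
    if (r->lo > r->hi) { *r = add; return 1; }  /* first information */
    if (add.lo < r->lo) { r->lo = add.lo; changed = 1; }
    if (add.hi > r->hi) { r->hi = add.hi; changed = 1; }
    return changed;
}

/* Returns 1 if the ranges stabilized, 0 if max_passes was reached. */
int solve(struct range *vars, size_t nvars,
          const struct constraint *cs, size_t ncs, int max_passes)
{
    for (size_t i = 0; i < nvars; i++) { vars[i].lo = 1; vars[i].hi = 0; }
    for (int pass = 0; pass < max_passes; pass++) {
        int changed = 0;
        for (size_t i = 0; i < ncs; i++) {
            struct range in = cs[i].c;
            if (cs[i].src >= 0) {
                in = vars[cs[i].src];
                if (in.lo <= in.hi) { in.lo += cs[i].k; in.hi += cs[i].k; }
            }
            changed |= join(&vars[cs[i].dst], in);
        }
        if (!changed) return 1;
    }
    return 0;
}
```

Each variable ends up with one independent range, which is why a solution of this shape cannot express a relationship such as "len(dst) is always smaller than alloc(dst)".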
Experience: Linux nettools
The tool found buffer overrun problems that were previously undiscovered in a manual audit in 1996:
  a library blindly trusting the length returned by DNS lookups
  several unchecked strcpy()'s that could cause buffer overrun by spoofing
  a routine blindly copying the result of getnetbyname() into a fixed-size buffer
Experience: Sendmail 8.7.5
Run on an older version of Sendmail to compare against problems found by hand auditing
Found a number of possible buffer overrun errors that were fixed in later versions (8.7.6 & 8.8.6)
Performance
Static analysis time
  performance is "sub-optimal but usable"
  15 minutes on a fast Pentium III for Sendmail (32k lines of C code)
Greatly overshadowed by the time required to examine all warnings by hand
Scalability is in question, as they "have no experience with very large applications"
Limitations: Correctness
False alarms
  44 "Probable" warnings generated for Sendmail 8.9.3, with only 4 being actual off-by-one bugs
  For comparison, there were 695 call sites to potentially unsafe string operations
Reduce these by adding flow-sensitive or context-sensitive analysis
  - performance degradation
  + fewer false alarms means less user intervention
Flow-Insensitive Example
strcpy is not really reached unless it is safe:
  if (sizeof(dst) < strlen(src) + 1)
      break;
  strcpy(dst, src);
Incorrectly flagged as a possible overrun since the analysis is flow-insensitive!
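The same guard can be put in a self-contained form. The wrapper below is a hypothetical illustration (the name `guarded_copy` and its return convention are not from the paper). It takes the buffer size as a parameter, because sizeof on a pointer parameter would not give the buffer size.

```c
#include <string.h>

/* Hypothetical wrapper: copies src into dst only when it fits.
   Returns 1 on success, 0 when src plus its NUL would not fit. */
int guarded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size < strlen(src) + 1)
        return 0;          /* the strcpy below is never reached */
    strcpy(dst, src);      /* safe here, yet an analysis that ignores
                              control flow sees only "strcpy into dst" */
    return 1;
}
```

A flow-insensitive analysis generates len(src) ⊆ len(dst) for the strcpy regardless of the test that precedes it, so the warning appears even though the copy is guarded.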
Limitations: Completeness
False negatives
  pointer aliasing and primitive pointer operations are ignored
  A known Sendmail 8.7.5 overrun bug was missed due to this
  But of 10 known fixed overrun bugs in Sendmail 8.7.5, the tool missed only that one
How do you know you missed an error?
Pointer Aliasing Example
A 13-byte string is copied into the 10-byte buffer t:
  char s[20], *p, t[10];
  strcpy(s, "Hello");
  p = s + 5;
  strcpy(p, " world!");
  strcpy(t, s);
This is not caught due to pointer aliasing
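The 13-byte figure can be checked without performing the overflowing copy. The sketch below repeats the first four statements of the example and returns the number of bytes the final strcpy would need; the function name `bytes_needed` is an assumption made for this illustration.

```c
#include <string.h>

/* Bytes, including the NUL, that strcpy(t, s) would need in the example. */
size_t bytes_needed(void)
{
    char s[20], *p;
    strcpy(s, "Hello");
    p = s + 5;              /* p aliases the NUL at the end of "Hello" */
    strcpy(p, " world!");   /* written through p, but it lengthens s */
    return strlen(s) + 1;   /* "Hello world!" plus its NUL */
}
```

An analysis that does not track p as an alias of s still believes s holds the 6 bytes of "Hello", which fits in the 10-byte buffer t, so it raises no warning.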
Conclusions
Useful for review of legacy code
  gives pointers to reviewers of areas to concentrate on
  An improvement of 15X over grep (Sendmail 8.9.3)
Found some previously undocumented buffer overrun vulnerabilities in “reviewed” code (Linux nettools)
Conclusions (cont.)
Static checking of code before deployment
  lacks the performance degradation of most run-time checkers
  program verification systems typically require programmers to annotate code
  currently requires much manual intervention to sort out real problems from false alarms
Future Directions
Addition of flow-sensitive analysis
  Expected removal of ~48% of false alarms
Addition of flow- and context-sensitive analysis with linear invariants and pointer analysis
  Expected removal of ~95% of false alarms
Both would have some obvious performance impact