“Tracking Pointers with Path and Context Sensitivity for Bug Detection in C Programs” CMSC 838Z – Spring 2004 V. Benjamin Livshits and Monica S. Lam presented by Mujtaba Ali (Based on a presentation given at ACM SIGSOFT FSE-11, September 2003)




Bugs are Bad

• Costs lots of money to fix after deployment

• Nastiest bugs == security violations
– Hard to discover effects of malicious attacks
– Legal/liability issues


CVE Classification

[Chart: 2002 security reports by vulnerability class]

• Buffer overflow
• Malformed input
• Shell metacharacters
• Format strings
• Bad permissions
• Symlink following
• Privilege handling
• Cross-site scripting
• Cryptographic error
• Directory traversal

62%: would like to address these

• Study of security reports in 2002
– source: SecurityFocus.com


Motivation

• Goal: Report hard-to-find security violations
– Errors spanning many functions and files
– Reduce false positives

• Applications:
– Buffer overflows
– Format string vulnerabilities


Examples

• Two security vulnerabilities:
– Buffer overrun in gzip, compression utility
– Format string violation in muh, network game

• Cause: Unsafe use of user-supplied data
– gzip copies data to a statically-sized buffer
  • may result in an overrun
– muh uses data as the format argument in a call to vsnprintf
  • user can maliciously embed %n into the format string


Buffer Overrun in gzip

gzip.c:233
0233 char ifname[MAX_PATH_LEN]; /*input file name*/

gzip.c:593
0592     while (optind < argc) {
0593         treat_file(argv[optind++]);
0594     }

gzip.c:716
0704 local void treat_file(char *iname){
...
0716     if (get_istat(iname, &istat) != OK) return;

gzip.c:1009
0997 local int get_istat(char *iname, struct stat *sbuf){
...
1009     strcpy(ifname, iname);

Need a model of strcpy
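The overrun above happens because strcpy copies an argv element of arbitrary length into the fixed-size ifname. A minimal sketch of a length-checked copy follows. The names copy_name and NAME_CAP are hypothetical, chosen for illustration only; this is not code from gzip and not necessarily how gzip was fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAP 16   /* hypothetical capacity, stands in for MAX_PATH_LEN */

/* Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success and -1 if src is too long (dst is left empty). */
int copy_name(char dst[NAME_CAP], const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (len >= NAME_CAP) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);
    return 0;
}
```

A caller would check the return value and reject the file name instead of continuing with a truncated or overflowing buffer.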


Format String Violation in muh

muh.c:839
0838             s = ( char * )malloc( 1024 );
0839             while( fgets( s, 1023, messagelog ) ) {
0841                irc_notice(&c_client, status.nickname, s);
0842             }
0843             FREESTRING( s );

irc.c:263
257 void irc_notice(con_type *con, char nick[], char *format, ... ){
259     va_list va;
260     char buffer[ BUFFERSIZE ];
261
262     va_start( va, format );
263     vsnprintf( buffer, BUFFERSIZE - 10, format, va );
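The violation arises because a line read from a file reaches vsnprintf in the format position. The sketch below contrasts that pattern with passing the line as data behind a constant "%s" format. The functions notice, notice_unsafe, and notice_safe are hypothetical stand-ins for irc_notice and its callers, not code from muh. The harmless "%%" sequence is used in place of %n to show that the input really is interpreted.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for irc_notice: formats a message into out and
 * returns the number of characters the full message needs. */
int notice(char *out, size_t cap, const char *format, ...)
{
    va_list va;
    int n;

    assert(out != NULL && cap > 0 && format != NULL);
    va_start(va, format);
    n = vsnprintf(out, cap, format, va);
    va_end(va);
    return n;
}

/* Unsafe pattern: the untrusted line is interpreted as a format string. */
int notice_unsafe(char *out, size_t cap, const char *line)
{
    return notice(out, cap, line);
}

/* Safer pattern: constant format, untrusted line passed as an argument. */
int notice_safe(char *out, size_t cap, const char *line)
{
    return notice(out, cap, "%s", line);
}
```

With the input "100%% done", the unsafe variant collapses "%%" to "%" because the text is parsed as a format, while the safe variant copies it unchanged.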


Easy Bugs are Boring

• Programs have security violations despite code reviews and years of use

• Common observation about hard errors:
– Errors on interface boundaries: need to follow data flow between procedures
– Errors occur along complicated control-flow paths: need to follow long definition-use chains


Need for Alias Analysis

• Both examples involved complex flow of data

• Tracking data flow in modern languages requires alias analysis

• Steensgaard’s or Andersen’s analysis?
– Flow- and context-insensitive
– Fast, but imprecise: too many false positives
– But flow- and context-sensitive analyses do not scale


Tradeoff: Scalability vs Precision

[Chart: precision (low to high) against speed/scalability (slow and expensive to fast). Analyses plotted: Steensgaard, Andersen, Wilson & Lam, 3-value logic, and this analysis.]


A Hybrid Analysis

• Analyze precisely:
– Local variables
– Function parameters
– Global variables
– …their dereferences and fields
– These are essentially access paths, e.g. p.next.data

• The rest: Break into equivalence classes

• Represent by abstract locations:
– Recursive data structures
– Arrays
– Locations accessed through pointer arithmetic

• Maintain precision selectively


Linearity

• Regular assignments result in strong updates

• Assignments to abstract memory locations – weak updates

x = 1;        x0 = 1
x = 2;        x1 = 2
y = x;        y0 = x1
(x is 2)

A[i] = 1;     m0 = 1
A[j] = 2;     m1 = φ(m0, 2)
b = A[k];     b0 = m1
(Either 1 or 2)
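The difference between the two kinds of update can be sketched with a tiny set of possible values. This is an illustration only, assuming values in 0..31 so a set fits in one machine word; ValueSet, strong_update, weak_update, and may_be are hypothetical names and not part of the tool described in the talk.

```c
#include <assert.h>

/* Set of possible integer values, one bit per value in 0..31. */
typedef unsigned int ValueSet;

ValueSet singleton(unsigned v)
{
    assert(v < 32);
    return 1u << v;
}

/* Strong update: the target is a single concrete variable,
 * so the new value replaces whatever was there before. */
ValueSet strong_update(ValueSet old, unsigned v)
{
    (void)old;
    return singleton(v);
}

/* Weak update: the target stands for many concrete cells (e.g. an array),
 * so the old values may survive alongside the new one. */
ValueSet weak_update(ValueSet old, unsigned v)
{
    return old | singleton(v);
}

/* Nonzero if v is one of the values the set allows. */
int may_be(ValueSet s, unsigned v)
{
    return (s & singleton(v)) != 0;
}
```

Applying strong_update twice models x = 1; x = 2 and leaves only 2, while applying weak_update twice models A[i] = 1; A[j] = 2 and leaves both 1 and 2 possible.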


Static Single Assignment

• Sparse representation of program

• Propagate facts about definition where needed
– Definition-use relationships

• Each variable is defined only once
– Give new names (subscripts) to definitions
– Solves flow-sensitivity problem

• But a definition may be used many times


SSA: Join points

• In standard SSA, use φ function at joins
– d3 = φ(d1, d2)

• In Gated SSA, use γ function at joins
– d3 = γ(<P, d1>, <¬P, d2>)
– Solves path-sensitivity problem


IPSSA – Intraprocedurally

• Extension of Gated SSA

• Provides pointer resolution
– Replace indirect pointer dereferences with direct accesses of potentially new temporary locations


Example of Pointer Resolution

int a=0,b=1;           a0 = 0, b0 = 1
int c=2,d=3;           c0 = 2, d0 = 3

if(Q){ p = &a; }       p1 = &a
else{ p = &b; }        p2 = &b
                       p3 = γ(<Q, p1>, <¬Q, p2>)

Load resolution
c = *p;                c1 = γ(<Q, a0>, <¬Q, b0>)

Store resolution
*p = d;                a1 = γ(<Q, d0>, <¬Q, a0>)
                       b1 = γ(<Q, b0>, <¬Q, d0>)
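One way to convince yourself that the resolved form means the same thing as the pointer code is to run both. The sketch below executes the slide's program once through a real pointer and once through predicate-guarded direct accesses, then compares the results. State, run_with_pointer, run_resolved, same_state, and check_equivalence are hypothetical names introduced for this illustration.

```c
#include <assert.h>

typedef struct { int a, b, c, d; } State;

/* The slide's program, executed with a real pointer. */
State run_with_pointer(int Q)
{
    State s = {0, 1, 2, 3};
    int *p = Q ? &s.a : &s.b;
    s.c = *p;      /* load through p  */
    *p = s.d;      /* store through p */
    return s;
}

/* The same program after pointer resolution: every dereference becomes a
 * predicate-guarded choice between the locations p may point to. */
State run_resolved(int Q)
{
    int a0 = 0, b0 = 1, d0 = 3;
    State s;
    s.c = Q ? a0 : b0;   /* c1 = gamma(<Q, a0>, <not Q, b0>) */
    s.a = Q ? d0 : a0;   /* a1 = gamma(<Q, d0>, <not Q, a0>) */
    s.b = Q ? b0 : d0;   /* b1 = gamma(<Q, b0>, <not Q, d0>) */
    s.d = d0;
    return s;
}

/* Returns 1 when both states agree on every variable. */
int same_state(State x, State y)
{
    return x.a == y.a && x.b == y.b && x.c == y.c && x.d == y.d;
}

/* Checks that both versions agree for Q false and Q true. */
void check_equivalence(void)
{
    for (int Q = 0; Q <= 1; Q++)
        assert(same_state(run_with_pointer(Q), run_resolved(Q)));
}
```

The resolved version contains no dereference at all, which is what lets later analyses follow definition-use chains directly.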


Pointer Resolution Rules

• When resolving definition d, next step depends on RHS of d

• Expressed as conditional rewrite rules

• A few sample rules:
– d = &x, result is x
– d = ι(…), result is d^
– d = γ(<P1, d1>,…,<Pn, dn>), follow d1…dn

• Refer to the paper for details


Interprocedural Example

• Data flow in and out of functions:
– Create links between formal and actual parameters
– Reflect stores and assignments to globals at the callee

int f(int* p){ *p = 100; }     p0 = ι(<c,q0>)      Formal-actual connection for call site c
                               p^1 = 100

int main(){
    int x = 0;                 x0 = 0
    int *q = &x;               q0 = &x
    c: f(q);                   x1 = ρ(<f,100>)     Reflect store inside of f within main
}


Unsound Unaliasing Assumption

A1: No aliased parameters
– Assumption: Locations accessible through different parameters are distinct
– Justification: Matches how good interfaces are written
– Consequence: Context-independent procedure summaries

A2: No aliased abstract locations
– Assumption: Things pulled out of an abstract location are not aliased
– Justification: Holds in most usage cases
– Consequence: Give unique names when we get data from abstract location


Interprocedural Algorithm

• Process one SCC of the call graph at a time
– Bottom-up
– SCC == strongly-connected component

• For each SCC, within each procedure:
– Resolve all pointer operations (loads and stores)
– Create links between formal and actual parameters
– Reflect stores and assignments to globals at call sites

• Iterate within SCC until the representation stabilizes
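The bottom-up SCC order can be obtained with Tarjan's algorithm, which completes an SCC only after every SCC it calls into. The sketch below is an assumption about how such an ordering could be computed, not the tool's actual implementation. CallGraph, visit, number_sccs, and MAX_PROCS are hypothetical names, and the per-procedure work (pointer resolution, linking, iterating to a fixed point) is left out.

```c
#include <assert.h>

#define MAX_PROCS 8   /* hypothetical bound on call-graph size */

typedef struct {
    int n;                              /* number of procedures */
    int calls[MAX_PROCS][MAX_PROCS];    /* calls[u][v] != 0: u calls v */
    int index[MAX_PROCS], low[MAX_PROCS], on_stack[MAX_PROCS];
    int stack[MAX_PROCS], sp, next_index;
    int scc_of[MAX_PROCS];              /* SCC number of each procedure */
    int scc_count;
} CallGraph;

static void visit(CallGraph *g, int u)
{
    g->index[u] = g->low[u] = g->next_index++;
    g->stack[g->sp++] = u;
    g->on_stack[u] = 1;

    for (int v = 0; v < g->n; v++) {
        if (!g->calls[u][v]) continue;
        if (g->index[v] < 0) {
            visit(g, v);
            if (g->low[v] < g->low[u]) g->low[u] = g->low[v];
        } else if (g->on_stack[v] && g->index[v] < g->low[u]) {
            g->low[u] = g->index[v];
        }
    }

    if (g->low[u] == g->index[u]) {     /* u is the root of an SCC */
        int v;
        do {
            v = g->stack[--g->sp];
            g->on_stack[v] = 0;
            g->scc_of[v] = g->scc_count;
        } while (v != u);
        g->scc_count++;
    }
}

/* SCCs are numbered in completion order, so a callee's SCC never gets a
 * larger number than its caller's: visiting SCC 0, 1, 2, ... is bottom-up. */
void number_sccs(CallGraph *g)
{
    assert(g->n >= 0 && g->n <= MAX_PROCS);
    g->sp = g->next_index = g->scc_count = 0;
    for (int u = 0; u < g->n; u++) { g->index[u] = -1; g->on_stack[u] = 0; }
    for (int u = 0; u < g->n; u++)
        if (g->index[u] < 0) visit(g, u);
}
```

Mutually recursive procedures end up with the same SCC number, which marks the group that has to be iterated until the representation stabilizes.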


Framework

[Diagram: Program sources → IPSSA construction → IP data flow info → analyses (buffer overruns, format violations, …others…) → error traces]

• Abstracts away many details; makes it easy to write tools
• Framework makes it easy to add new analyses


Application

1. Start at roots – sources of user input such as
– argv[] elements
– Input functions: fgets, gets, recv, getenv, etc.
2. Follow data flow chains provided by IPSSA: for every definition, IPSSA provides a list of its uses
3. A sink is a potentially dangerous usage such as
– A buffer of a statically defined length
– A format argument of vulnerable functions: printf, fprintf, snprintf, vsnprintf
4. Report bug, record full path
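The four steps above amount to a reachability search over definition-use chains that remembers the path it took. The sketch below shows that idea on a small explicit graph. It is a simplification under stated assumptions: predicates on γ functions are ignored, and DefUseGraph, search, find_violation, and MAX_DEFS are hypothetical names rather than the tool's data structures.

```c
#include <assert.h>

#define MAX_DEFS 16   /* hypothetical bound on the number of definitions */

typedef struct {
    int n;                          /* number of definitions */
    int uses[MAX_DEFS][MAX_DEFS];   /* uses[d][u] != 0: definition u uses d */
    int is_sink[MAX_DEFS];          /* dangerous use (e.g. format argument) */
} DefUseGraph;

/* Depth-first search along def-use chains starting at definition d.
 * On success, path[0..*len-1] holds the chain from the root to a sink. */
static int search(const DefUseGraph *g, int d, int *seen, int *path, int *len)
{
    seen[d] = 1;
    path[(*len)++] = d;
    if (g->is_sink[d]) return 1;
    for (int u = 0; u < g->n; u++)
        if (g->uses[d][u] && !seen[u] && search(g, u, seen, path, len))
            return 1;
    (*len)--;                       /* dead end: drop d from the path */
    return 0;
}

/* Returns the length of a root-to-sink path written to path, or 0 if the
 * tainted root never reaches a sink. */
int find_violation(const DefUseGraph *g, int root, int path[MAX_DEFS])
{
    int seen[MAX_DEFS] = {0};
    int len = 0;

    assert(g->n <= MAX_DEFS && root >= 0 && root < g->n);
    return search(g, root, seen, path, &len) ? len : 0;
}
```

In the muh example, the root would be the buffer filled by fgets and the sink the format argument of vsnprintf; the recorded path is what gets reported as the error trace.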


Experimental Setup

• Implementation w/ SUIF2 compiler suite
– Pentium IV 2GHz, 2GB of RAM running Linux

Program     Version   LOC      Procedures   IPSSA constr. time (s)
lhttpd      0.1       888      21           5.2
polymorph   0.4.0     1,015    19           1.0
bftpd       1.0.11    2,946    47           3.2
trollftpd   1.26      3,584    48           11.3
man         1.5h1     4,139    83           29.3
pgp4pine    1.76      4,804    69           17.5
cfingerd    1.4.3     5,094    66           15.5
muh         2.05d     5,695    95           20.4
gzip        1.2.4     8,162    93           17.0
pcre        3.9       13,037   47           22.4


Summary of Results

Program     Total      Buffer     Format string   False       Defs         Procs        Tool's
name        warnings   overruns   vulner.         positives   spanned      spanned      runtime (s)
lhttpd      1          1          0               0           24           14           99
polymorph   2          2          0               0           7, 8         3, 3         2.4
bftpd       2          1          1               0           5, 7         1, 3         2.3
trollftpd   1          1          0               0           23           5            8.5
man         1          1          0               0           6            4            9.6
pgp4pine    4          4          0               0           5, 5, 5, 5   3, 3, 3, 3   27.1
cfingerd    1          0          1               0           10           4            7.4
muh         1          0          1               0           7            3            7.5
gzip        1          1          0               0           7            5            2.0
pcre        1          0          0               1           6            4            9.2
Total       15         11         3               1

Previously unknown: 6

• Warnings span many definitions
• Warnings span many procedures


False Positive in pcre

• Copying “tainted” user data to a statically-sized buffer may be unsafe
– Turns out to be safe in this case

sprintf(buffer, "%.512s", filename)

– The "%.512s" precision limits the length of copied data. Buffer is big enough!
– filename is the tainted data


Related Work

• xgcc
– Also unsound
– Many more false positives
– Even “regular” developers can specify new analyses

• Cqual
– Interprocedural, sound analysis
– Requires annotations
– Flow-, context-, and path-insensitive
  • More false positives


The Good

• Unsoundness gives very few false positives

• Unsoundness allows efficient path- and context-sensitive analysis

• Tested on real code

• Good presentations to steal slides from


The Bad

• Cqual is much improved now
– Very few false positives on these types of security vulnerabilities
– Let’s see some more interesting vulnerabilities

• Does not detect buffer overflows due to array bounds violations

• Not tested with large programs


Singular Key Idea

• IPSSA coupled with a (slightly) unsound alias analysis facilitates efficient detection of hard-to-find security violations with very few false positives