KLEE: Unassisted and Automatic Generation of High-Coverage Tests...

Post on 08-Aug-2020

KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs

Cristian Cadar, Daniel Dunbar, Dawson Engler
Stanford University

Presented by Adam Bergstein
November 28, 2011

Outline
•  Background
   –  Symbolic execution
   –  Constraints and solvers
   –  Sinks / sink sources
   –  Abstract domain and concretization
   –  System modeling
•  KLEE
   –  Main concepts
   –  Overall process
   –  Precision from LLVM and bytecode
   –  Notion of states
   –  Constraints and paths
   –  Performance and Environment
   –  Results
•  My Thoughts
•  Questions

Background
•  Symbolic execution
   –  Simulation that approximates variable values by using symbols
   –  Operations on variables constrain the symbols
   –  Used to reason about possible values that cause certain conditions in a program
      •  Is a symbolic value in the range of values that cause something to occur?
   –  http://www.stat.uga.edu/stat_files/billard/tr_symbolic.pdf
•  Constraints and solvers
   –  Constraints are collected facts about a program that define bounds on possible execution at specific points in a program
   –  Solvers determine the possibility of concrete values based on the constraints
   –  Certain concrete values can conditionally cause programs to behave in undesirable ways
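The relationship between constraints and a solver can be sketched in a few lines of Python. This is only an illustration: the brute-force `solve` function, the bounded domain, and the example constraints are assumptions made here, and a real solver reasons about constraints symbolically instead of enumerating values.

```python
from typing import Callable, Iterable, List, Optional

# A constraint is modeled as a predicate over one integer input (an assumption
# made for this sketch; real constraints range over many symbolic variables).
Constraint = Callable[[int], bool]


def solve(constraints: List[Constraint], domain: Iterable[int]) -> Optional[int]:
    """Return the first value in `domain` that satisfies every constraint, or None."""
    for candidate in domain:
        if all(check(candidate) for check in constraints):
            return candidate
    return None


# Facts collected along one hypothetical path: x > 10 and x is even.
path_constraints = [lambda x: x > 10, lambda x: x % 2 == 0]

# Hypothetical undesirable condition at some program point: x * 3 == 36.
undesirable = lambda x: x * 3 == 36

# A concrete value that satisfies the path constraints and triggers the condition.
witness = solve(path_constraints + [undesirable], range(-128, 128))
```

If no value in the domain satisfies all constraints, `solve` returns `None`, which corresponds to the solver reporting that the condition cannot occur on that path.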

Background
•  Sinks and sink sources
   –  Sinks identify meaningful operations within the code
   –  Sources identify the data origins that can influence sinks
•  Abstract domain and concretization
   –  Defining the range of all possible values for variables
   –  Concretization maps actual variable values from ranges of possible values
•  System modeling
   –  “Approximating” how a system behaves when it runs
   –  We have looked at different ways to represent systems, like CFGs, summary functions, etc.

KLEE > Main Concepts
•  Use of static analysis to determine if there are possible concrete values that cause vulnerabilities in the program
•  Simulate a program and leverage symbolic execution
•  Build constraints and maintain a series of states throughout the simulation
   –  States define each unique path throughout the program
•  Leverage a solver to determine possibilities within the program based on constraints
   –  Return concrete values if something was solvable
•  Document areas of the code that have any possible values that can cause vulnerabilities
   –  Based on a set of possible dangerous operations
•  “Based on the constraints (state of unique path) at the time I get to this line of code with a potentially dangerous operation, is there any possible value that can cause this line of code to be dangerous?”

KLEE > Main Concepts
•  KLEE begins by constructing unconstrained variables for arguments into state
   –  Initial constraints are set based on --sym-args when running KLEE
   –  Defines number of arguments and number of characters per argument
   –  Sets initial constraints so operation is not totally unbounded
•  Analysis simulates each instruction and runs each state per instruction
   –  Scheduling algorithm to select which state to analyze first
   –  Collect more constraints, update the symbolic values in the state
   –  When reaching a potential operation that contains an exit or error, look at the path condition
•  Path conditions are the collection of constraints that are valid for that specific path
   –  A path condition is unique for each state since a path can influence the symbolic values on a path-by-path basis
   –  On a branch statement, a state is cloned for possible paths
   –  The path condition is updated per state, to mimic unique paths
•  The search for malicious concrete values is bounded by the path condition
   –  These are sent to the STP solver
   –  Is there a possible set of values that can cause an issue?
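The cloning of states at branches can be sketched as follows. This is a toy model, not KLEE's implementation: the `State` class, the single integer input, the small `DOMAIN`, and the brute-force feasibility check standing in for the solver are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Predicate = Callable[[int], bool]

# Assumed small input domain so that brute force can stand in for a solver.
DOMAIN = range(-8, 8)


@dataclass
class State:
    # The path condition: labeled constraints collected along this path.
    path_condition: List[Tuple[str, Predicate]] = field(default_factory=list)

    def feasible(self) -> bool:
        """True if some input in DOMAIN satisfies the whole path condition."""
        return any(all(p(x) for _, p in self.path_condition) for x in DOMAIN)

    def fork(self, label: str, pred: Predicate) -> List["State"]:
        """Clone the state for both branch outcomes and keep the feasible ones."""
        taken = State(self.path_condition + [(label, pred)])
        not_taken = State(
            self.path_condition + [("not " + label, lambda x: not pred(x))]
        )
        return [s for s in (taken, not_taken) if s.feasible()]


# Two consecutive branches of a hypothetical program: `x < 0`, then `x > 5`.
states = [State()]
for label, pred in [("x < 0", lambda x: x < 0), ("x > 5", lambda x: x > 5)]:
    states = [child for s in states for child in s.fork(label, pred)]

paths = [[label for label, _ in s.path_condition] for s in states]
```

Of the four syntactic paths, only three survive: the combination `x < 0` and `x > 5` has no satisfying input, so that clone is dropped.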

KLEE > Overall Process
•  Compile program into bytecode with LLVM
•  Run KLEE with defined number of arguments and initial character bound constraints of arguments
   –  Assists with abstract domain to make it bounded
•  Simulate the program, symbolic execution
   –  Collect constraints on variables, update state
•  For branches, determine what is possible based on constraints
   –  Pass constraints to solver to see what branch is possible
   –  Clone state for all possible branches, update path conditions in each state
   –  Similar to may/must analysis
•  For potential dangerous operations, identify any concrete values that cause dangerous operations
   –  Pass constraints to solver
   –  Return any possible values that can cause undesired results
•  Useful for bounds checking, pointer dereferencing, assertions
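As an illustration of the check made at a potentially dangerous operation, the sketch below asks whether any input allowed by a guard can still index outside a buffer. The buffer size, the two guards, and the brute-force search used in place of a solver are hypothetical choices made for this example.

```python
from typing import Callable, Optional

BUFFER_SIZE = 4                # hypothetical buffer length
INPUT_DOMAIN = range(-16, 16)  # assumed bounded input domain


def find_out_of_bounds(guard: Callable[[int], bool]) -> Optional[int]:
    """Return an index that passes `guard` but falls outside the buffer, or None."""
    for i in INPUT_DOMAIN:
        if guard(i) and not (0 <= i < BUFFER_SIZE):
            return i
    return None


# Path condition from a hypothetical off-by-one guard: 0 <= i <= BUFFER_SIZE.
buggy_guard = lambda i: 0 <= i <= BUFFER_SIZE

# Path condition from the corrected guard: 0 <= i < BUFFER_SIZE.
fixed_guard = lambda i: 0 <= i < BUFFER_SIZE

buggy_witness = find_out_of_bounds(buggy_guard)
fixed_witness = find_out_of_bounds(fixed_guard)
```

The returned witness plays the role of the concrete value that reproduces the error; when the search returns `None`, the access is safe on that path within the assumed domain.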

KLEE > Precision from LLVM bytecode
•  The constraints are very precise because the bytecode represents bit-level accuracy
•  This reduces the approximation used in modeling the running application
•  This precision makes the solver more effective in determining possible values
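One way to see why bit-level accuracy matters: with idealized unbounded integers, `x + 1 < x` is never true, but with fixed-width arithmetic it holds at the wraparound point. The 8-bit width and the helper `add_u8` below are assumptions chosen to keep the example small.

```python
BITS = 8                 # assumed width for this sketch
MASK = (1 << BITS) - 1   # 0xFF


def add_u8(a: int, b: int) -> int:
    """Unsigned 8-bit addition with wraparound."""
    return (a + b) & MASK


# Idealized model with unbounded integers: no input makes x + 1 < x true.
unbounded_witnesses = [x for x in range(256) if x + 1 < x]

# Bit-accurate model: the same comparison has exactly one satisfying input.
wrapped_witnesses = [x for x in range(256) if add_u8(x, 1) < x]
```

An analysis that models integers as unbounded would report the branch `x + 1 < x` as unreachable, while a bit-accurate model finds the input that reaches it.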

KLEE > Notion of States
•  Each state represents one unique path in the program at a given point in runtime
•  Need to maintain symbolic values by state at the given instruction
•  Maintains register file, stack, heap, program counter
   –  Instruction pointer is maintained by KLEE
•  Maintain constraints of the path conditions for use within the solver
   –  States may be active or inactive for a given instruction based on path condition and constraints

KLEE > Constraints and Paths
•  The goal is to find concrete values that cause dangerous operations
•  For the solver to be effective in finding concrete values, the abstract domain needs to be reduced
•  Path conditions set constraints on variable values of the specific path
   –  i < 0, j == 10, etc.
•  Symbolic values create their own constraints on variables
   –  i = (2 x i) + 10
   –  j = j2
•  The combination of symbolic values and path conditions sets bounds for the solver to determine possible values based on state for a given instruction
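The interaction between a symbolic update and a path condition can be worked through with the slide's own examples, `i = (2 x i) + 10` followed by the branch `i < 0`. In this sketch the input domain is an assumption, the arithmetic uses Python's unbounded integers for simplicity, and enumeration stands in for the solver.

```python
DOMAIN = range(-128, 128)  # assumed range of the original input i0


def i_after_update(i0: int) -> int:
    """Symbolic value of i after the assignment i = (2 x i) + 10."""
    return (2 * i0) + 10


def path_condition(i: int) -> bool:
    """Branch condition checked on the updated value of i."""
    return i < 0


# Inputs whose updated value satisfies the path condition: 2*i0 + 10 < 0.
solutions = [i0 for i0 in DOMAIN if path_condition(i_after_update(i0))]
```

The constraint `i < 0` on the updated variable becomes `2*i0 + 10 < 0` on the original input, so only inputs below -5 drive execution down this path.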

KLEE > Performance and Environment
•  Two of the biggest challenges were performance and modeling operations involving the environment
•  The number of states can grow rapidly
   –  To combat it, KLEE uses a shared memory mapping between states
•  Use of compiler-like tricks to make problems easier for the solver
•  Environment calls are modeled by C code, to reflect the runtime state
   –  Use of uClibc to mimic system calls
   –  KLEE developers have set up other custom models to reflect operations involving the environment
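The idea of sharing memory between states can be sketched as copy-on-write at the level of individual objects. This is a simplified illustration under assumed names (`fork_heap`, `write`, and a heap modeled as a dictionary of immutable tuples), not KLEE's actual data structures.

```python
from typing import Dict, Tuple

# Heap modeled as: object name -> immutable contents (an assumption of this sketch).
Heap = Dict[str, Tuple[int, ...]]


def fork_heap(heap: Heap) -> Heap:
    """Clone the mapping only; the object contents stay shared between states."""
    return dict(heap)


def write(heap: Heap, name: str, index: int, value: int) -> None:
    """Replace one object in this heap with an updated copy; other heaps are unaffected."""
    old = heap[name]
    heap[name] = old[:index] + (value,) + old[index + 1:]


parent: Heap = {"buf": (0, 0, 0, 0), "table": tuple(range(1000))}
child = fork_heap(parent)

# Only the written object is copied; the large untouched object stays shared.
write(child, "buf", 1, 42)
```

After the write, `child["table"]` and `parent["table"]` are still the same object, so the cost of forking a state depends on what it later modifies, not on the total size of its memory.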

KLEE > Results
•  Looked at packages which supported common command-line programs like ls and tr
•  Average of 90% code coverage
•  Highlighted differences between CoreUtils and Busybox
   –  Simulated the same commands and found differences between the two packages
•  Found errors in both CoreUtils and Busybox

Differences between CoreUtils and Busybox
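Comparing two implementations of the same command amounts to searching for an input on which their outputs differ. The sketch below shows that idea on two hypothetical functions, `impl_a` and `impl_b`, which are invented for this example and are not taken from either package; exhaustive enumeration of short strings stands in for symbolic inputs.

```python
from itertools import product
from typing import Iterator


def impl_a(path: str) -> str:
    """Hypothetical implementation: strips trailing slashes but keeps a lone root."""
    stripped = path.rstrip("/")
    if path and not stripped:
        return "/"
    return stripped


def impl_b(path: str) -> str:
    """Hypothetical implementation: strips trailing slashes unconditionally."""
    return path.rstrip("/")


def candidate_inputs(alphabet: str = "a/", max_len: int = 3) -> Iterator[str]:
    """Enumerate every string over `alphabet` up to `max_len` characters."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


# Inputs on which the two implementations disagree.
mismatches = [s for s in candidate_inputs() if impl_a(s) != impl_b(s)]
```

Each mismatch is a concrete input that can be replayed against both implementations to decide which behavior, if either, is the intended one.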

My Thoughts
•  There are a lot of similarities to what we have discussed in class
   –  PHP paper used sinks and sink sources with query statements
   –  This paper looks for operations like pointers, assertions, printf, and load/stores
   –  Symbolic execution like the PHP paper
   –  May/must analysis for looking at potential paths
   –  Constraints and use of a solver
•  Constraints defined by symbolic analysis and paths
   –  Can be considered context and flow sensitive
      •  Creates new states based on path branches
      •  Simulates function calls per state based on the current state values
   –  Concretization based on symbolic values and path conditions

My Thoughts
•  There are some differences between the approaches
   –  No mention of a control flow graph, purely a simulation tool
   –  Their goal is only to find concrete values based on states, so there are no meet or join operations
      •  They are looking at specific states and deriving concrete values that are dangerous
      •  They are not approximating system functionality
   –  Other static analysis used approximation because precision is expensive
•  I am curious how large the tested applications were
•  Authors claim that the code was complicated but my assumption is that there was not a lot of code

Questions

Which University has the Hard Times Café shown to the left?
