Debugging: Love It, Hate It Or Reverse It?

Upload: undo

Post on 15-Apr-2017

TRANSCRIPT

Page 1: Debugging: Love It, Hate It or Reverse It?

Debugging: Love It, Hate It Or Reverse It?

Page 2: Debugging: Love It, Hate It or Reverse It?

Debugging: Love It, Hate It Or Reverse It?

Julian Smith, co-founder and CTO. [email protected] http://undo.io/

Page 3: Debugging: Love It, Hate It or Reverse It?

Overview

Testing.

Debugging: Debugging with gdb. Strace. Valgrind. Recording execution.

(Linux-specific.)

Page 4: Debugging: Love It, Hate It or Reverse It?

Testing.

Testing has changed: Continuous integration. Test-driven development. Cloud testing.

Resulting in: 1,000s of tests per hour. Many intermittent test failures. Very difficult to fix them all.

Page 5: Debugging: Love It, Hate It or Reverse It?

Testing. Security breaches. Production outages. Unhappy users.

Page 6: Debugging: Love It, Hate It or Reverse It?

Fixing test failures is hard.

Testing.

Recreate complex setups: Multi-application. Networking. Multi-machine.

Re-run flakey tests many times to reproduce failure.

Recompile/link with changes when investigating: Changes behaviour. Slow. Requires a developer machine.

Page 7: Debugging: Love It, Hate It or Reverse It?

Fixing test failures is slow.

Testing.

Reproducing slow failures is… slow. Reproducing intermittent failures is also slow.

Requires running a test many times in order to catch the failure.

Critical bugs: Can occur once in a thousand runs. Each run can take hours.
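To put rough numbers on the slide above: if runs are independent and each one fails with probability p, the chance of seeing at least one failure in n runs is 1 - (1 - p)^n. The sketch below solves that for n. The 95% target and the one-hour run time are illustrative assumptions, not figures from the talk.

```python
import math

def runs_needed(failure_rate, confidence):
    """Smallest n such that 1 - (1 - failure_rate)**n >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))

# Illustrative figures: a one-in-a-thousand failure, 95% chance of catching it.
n = runs_needed(failure_rate=0.001, confidence=0.95)
hours_per_run = 1.0  # assumed; the slide only says "hours"
print("runs needed:", n)
print("machine-hours:", n * hours_per_run)
```

Under these assumptions the answer is roughly three thousand runs, which is why reproducing a rare failure by brute force quickly becomes impractical.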

Page 8: Debugging: Love It, Hate It or Reverse It?

Tools to fix test failures

Testing.

Debuggers. Logging. System logging. Memory checkers. Recording execution.

Page 9: Debugging: Love It, Hate It or Reverse It?

GDB

Debugging.

Better than you may remember. Ctrl-X Ctrl-A shows source code within the terminal window. GDB 7 has a Python extension.

Scripted debugging, e.g. to reproduce intermittent failures.

Page 10: Debugging: Love It, Hate It or Reverse It?

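GDB scripting.

Debugging.

repeat_until_non_zero_exit.py:

    '''Repeatedly run debuggee until it fails.'''
    import gdb

    while 1:
        # Separator between runs.
        print('-' * 40)
        gdb.execute('run')
        # $_exitcode holds the exit code of the last run.
        e = gdb.parse_and_eval('$_exitcode')
        print('$_exitcode is: %s' % e)
        if e != 0:
            break

Load it from the gdb prompt:

    (gdb) source repeat_until_non_zero_exit.py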

Page 11: Debugging: Love It, Hate It or Reverse It?

GUIs for gdb are getting better:

Debugging.

Eclipse. CLion. Qt Creator. KDbg. Emacs.

Page 12: Debugging: Love It, Hate It or Reverse It?

Logging.

Can sometimes work well. Need to control what to log.

Define areas of functionality and assign different debug levels. E.g. parser, lexer, network. More detailed: memory allocator, socket, serialiser.

We can define debug levels for different categories to match the bug we are investigating.

This can get complicated.

    logcategory_t* io_category = ...;
    logcategory_t* serialisation_category = ...;
    ...
    logf(io_category, "have read %zi bytes from socket fd=%i", n, fd);
    ...
    logf(serialisation_category, "serialised %p to %zi bytes", foo, actualsize);
    ...

Page 13: Debugging: Love It, Hate It or Reverse It?

Problems with logging categories.

Logging.

How many categories - how detailed should we go?

Depends on the bug we are investigating. May need to recompile with new categories.

What category do we use for code that writes serialised data to a file - io_category or serialisation_category?

Page 14: Debugging: Love It, Hate It or Reverse It?

Use programme structure for categories.

Logging.

We already have areas of functionality:

Source code directories. Source files. Functions.

We can use these as implicit categories:

No need to define our own categories. We get different levels of categories for free. We get nested categories for free.

Page 15: Debugging: Love It, Hate It or Reverse It?

Controlling verbosity programmatically:

Logging.

    debug_add("network/socket", NULL, 1);
    // Extra verbose for all diagnostics in network/socket*.*.

    debug_add("network/", NULL, 1);
    debug_add("network/socket", NULL, 1);
    // Extra verbose for all diagnostics in network/*.*.
    // Even more verbose in network/socket*.*.

    debug_add("heap/alloc.c", "", 1);
    debug_add("network/socket.c", "Send", 2);
    debug_add("parser/", "", -1);
    // Verbose for heap operations.
    // Very verbose for all diagnostics in network/socket.c:Send().
    // Less verbose in parser/.
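One way to picture how such prefix rules could combine is the small Python model below. It is only a sketch of the idea, not the implementation behind the slides: it assumes that every rule whose file prefix (and optional function name) matches contributes its delta, which is consistent with "even more verbose in network/socket*.*" above. The `debug_level` lookup function is hypothetical.

```python
# Hypothetical model of prefix-based verbosity; not the real debug_add().
_rules = []  # list of (file_prefix, function_name_or_None, delta)

def debug_add(file_prefix, function, delta):
    """Register a verbosity adjustment for files starting with file_prefix."""
    # Treat both None and "" as "any function".
    _rules.append((file_prefix, function or None, delta))

def debug_level(filename, function):
    """Sum the deltas of every rule matching this file and function."""
    level = 0
    for prefix, fn, delta in _rules:
        if filename.startswith(prefix) and fn in (None, function):
            level += delta
    return level

debug_add("network/", None, 1)
debug_add("network/socket", None, 1)
debug_add("network/socket.c", "Send", 2)
debug_add("parser/", "", -1)

print(debug_level("network/http.c", "Get"))     # 1
print(debug_level("network/socket.c", "Recv"))  # 2
print(debug_level("network/socket.c", "Send"))  # 4
print(debug_level("parser/lexer.c", "Next"))    # -1
```

Because matching is by path prefix, nested categories fall out of the directory layout without anyone having to declare them.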

Page 16: Debugging: Love It, Hate It or Reverse It?

Logging.

Control verbosity with environment variables:

QA-friendly. No need to recompile/link/build. Activate logging in different parts of the programme depending on the bug which is being investigated.

Example:

    DEBUG_LEVELS=heap/alloc.c=1,parser/=-1,network/socket.c:Send=2 myprog ...
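The DEBUG_LEVELS value shown above looks like a comma-separated list of `prefix[:function]=delta` items. A minimal Python sketch of parsing that shape follows. The grammar is inferred from the single example on the slide, and `parse_debug_levels` is a hypothetical helper name.

```python
def parse_debug_levels(spec):
    """Parse 'prefix[:function]=delta,...' into (prefix, function, delta) tuples."""
    rules = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last '=' so that negative deltas such as 'parser/=-1' work.
        target, _, delta = item.rpartition("=")
        prefix, _, function = target.partition(":")
        rules.append((prefix, function or None, int(delta)))
    return rules

spec = "heap/alloc.c=1,parser/=-1,network/socket.c:Send=2"
for rule in parse_debug_levels(spec):
    print(rule)
```

In a real programme the string would come from the environment, for example `os.environ.get("DEBUG_LEVELS", "")`, read once at startup.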

Page 17: Debugging: Love It, Hate It or Reverse It?

Strace.

Linux/unix-specific.

Get a detailed log of all syscalls.

    > strace date
    execve("/bin/date", ["date"], [/* 34 vars */]) = 0
    brk(0) = 0xd50000
    access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602059000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3</etc/ld.so.cache>
    fstat(3</etc/ld.so.cache>, {st_mode=S_IFREG|0644, st_size=144491, ...}) = 0
    mmap(NULL, 144491, PROT_READ, MAP_PRIVATE, 3</etc/ld.so.cache>, 0) = 0x7f7602035000
    close(3</etc/ld.so.cache>) = 0
    access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3</lib/x86_64-linux-gnu/libc-2.19.so>
    read(3</lib/x86_64-linux-gnu/libc-2.19.so>, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\34\2\0\0\0\0\0"..., 832) = 832
    fstat(3</lib/x86_64-linux-gnu/libc-2.19.so>, {st_mode=S_IFREG|0755, st_size=1738176, ...}) = 0
    mmap(NULL, 3844640, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3</lib/x86_64-linux-gnu/libc-2.19.so>, 0) = 0x7f7601a90000
    mprotect(0x7f7601c32000, 2093056, PROT_NONE) = 0
    mmap(0x7f7601e31000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</lib/x86_64-linux-gnu/libc-2.19.so>, 0x1a1000) = 0x7f7601e31000
    mmap(0x7f7601e37000, 14880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f7601e37000
    close(3</lib/x86_64-linux-gnu/libc-2.19.so>) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602034000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602033000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602032000
    arch_prctl(ARCH_SET_FS, 0x7f7602033700) = 0
    mprotect(0x7f7601e31000, 16384, PROT_READ) = 0
    mprotect(0x60e000, 4096, PROT_READ) = 0
    mprotect(0x7f760205b000, 4096, PROT_READ) = 0
    munmap(0x7f7602035000, 144491) = 0
    brk(0) = 0xd50000
    brk(0xd71000) = 0xd71000
    open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3</usr/lib/locale/locale-archive>
    fstat(3</usr/lib/locale/locale-archive>, {st_mode=S_IFREG|0644, st_size=1607760, ...}) = 0
    mmap(NULL, 1607760, PROT_READ, MAP_PRIVATE, 3</usr/lib/locale/locale-archive>, 0) = 0x7f7601ea9000
    close(3</usr/lib/locale/locale-archive>) = 0
    open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3</etc/localtime>
    fstat(3</etc/localtime>, {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0
    fstat(3</etc/localtime>, {st_mode=S_IFREG|0644, st_size=3661, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602058000
    read(3</etc/localtime>, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0\7\0\0\0\0"..., 4096) = 3661
    lseek(3</etc/localtime>, -2338, SEEK_CUR) = 1323
    read(3</etc/localtime>, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\10\0\0\0\10\0\0\0\0"..., 4096) = 2338
    close(3</etc/localtime>) = 0
    munmap(0x7f7602058000, 4096) = 0
    fstat(1</dev/pts/50>, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 50), ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f7602058000
    write(1</dev/pts/50>, "Mon 26 Sep 12:27:50 BST 2016\n", 29Mon 26 Sep 12:27:50 BST 2016
    ) = 29
    close(1</dev/pts/50>) = 0
    munmap(0x7f7602058000, 4096) = 0
    close(2</dev/pts/50>) = 0
    exit_group(0) = ?
    +++ exited with 0 +++

Subset of syscalls - file operations:

    > strace -y -e trace=file date
    execve("/bin/date", ["date"], [/* 34 vars */]) = 0
    access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3</etc/ld.so.cache>
    access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3</lib/x86_64-linux-gnu/libc-2.19.so>
    open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3</usr/lib/locale/locale-archive>
    open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3</etc/localtime>
    Mon 26 Sep 12:29:01 BST 2016
    +++ exited with 0 +++

Subset of syscalls - memory operations:

    > strace -y -e trace=memory date
    brk(0) = 0x25b8000
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc871000
    mmap(NULL, 144491, PROT_READ, MAP_PRIVATE, 3</etc/ld.so.cache>, 0) = 0x7f14cc84d000
    mmap(NULL, 3844640, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3</lib/x86_64-linux-gnu/libc-2.19.so>, 0) = 0x7f14cc2a8000

Page 18: Debugging: Love It, Hate It or Reverse It?

    mprotect(0x7f14cc44a000, 2093056, PROT_NONE) = 0
    mmap(0x7f14cc649000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</lib/x86_64-linux-gnu/libc-2.19.so>, 0x1a1000) = 0x7f14cc649000
    mmap(0x7f14cc64f000, 14880, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f14cc64f000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc84c000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc84b000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc84a000
    mprotect(0x7f14cc649000, 16384, PROT_READ) = 0
    mprotect(0x60e000, 4096, PROT_READ) = 0
    mprotect(0x7f14cc873000, 4096, PROT_READ) = 0
    munmap(0x7f14cc84d000, 144491) = 0
    brk(0) = 0x25b8000
    brk(0x25d9000) = 0x25d9000
    mmap(NULL, 1607760, PROT_READ, MAP_PRIVATE, 3</usr/lib/locale/locale-archive>, 0) = 0x7f14cc6c1000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc870000
    munmap(0x7f14cc870000, 4096) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f14cc870000
    Mon 26 Sep 12:29:40 BST 2016
    munmap(0x7f14cc870000, 4096) = 0
    +++ exited with 0 +++

Page 19: Debugging: Love It, Hate It or Reverse It?

Strace.

Summary:

Not perfect - only works at syscall level. But still useful for low-level investigations. No recompilation required.
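Raw strace logs get long quickly, so a first step is often to summarise them. The Python sketch below counts syscalls by name in captured output. It assumes one syscall per line in strace's default text format, and the sample lines are a hand-picked subset of the `date` trace shown earlier.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a line such as 'open("/etc/x", O_RDONLY) = 3'.
SYSCALL_RE = re.compile(r"^(\w+)\(")

def count_syscalls(lines):
    """Count syscalls by name, skipping lines that are not syscall entries."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    'access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
    'open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3</etc/ld.so.cache>',
    'open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 3</etc/localtime>',
    '+++ exited with 0 +++',
]
for name, n in count_syscalls(sample).most_common():
    print(name, n)
```

For a quick overview without any scripting, strace's own `-c` option prints a per-syscall summary table when the traced program exits.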

Page 20: Debugging: Love It, Hate It or Reverse It?

Overview:

Valgrind.

Linux, OS X, Solaris, Android. Very detailed checking of execution. Free. Similar to Purify etc.

Page 21: Debugging: Love It, Hate It or Reverse It?

Valgrind.

Memory checking:

Illegal memory accesses: Overrun/underrun heap blocks. Overrun stack. Use-after-free.

Double free. Memory leaks.

Thread checking:

Inconsistent lock orderings. Data races (e.g. missing mutex).

Other:

CPU cache behaviour. Heap profiler.

Page 22: Debugging: Love It, Hate It or Reverse It?

Highly recommended!

Valgrind.

Page 23: Debugging: Love It, Hate It or Reverse It?

New debugging technology in the last few years.

Recording execution.

Linux: Undo Live Recorder. RR.

Windows: Intellitrace (partial recording only). Time Machine For .Net (partial recording only).

Java: Chronon. Undo (soon).

Page 24: Debugging: Love It, Hate It or Reverse It?

Live Recorder.

A library, for linking into an application. Allows the application to control the recording of its own execution. Provides a simple C API to start/save/stop recording. API is defined in the undolr.h header file and implemented in the libundolr library.

Page 25: Debugging: Love It, Hate It or Reverse It?

Live Recorder.

Live Recorder recordings:

Are standard Undo Recording files. Contain everything needed to replay execution:

Non-deterministic events (inputs to program). Initial state (initial memory and registers).

Also contain information needed for symbolic debugging: Complete executable and .so files. Debuginfo files.

Allows debugging even when libraries and/or debug information are not available locally (e.g. load and replay on a different Linux distribution).

Loaded into UndoDB as with Save-Load:

    undodb-gdb --undodb-load <filename>

    (undodb-gdb) undodb-load <filename>

Full reversible debugging.

Page 26: Debugging: Love It, Hate It or Reverse It?

Live Recorder.

Library API (undolr.h):

    int undolr_recording_start(undolr_error_t* o_error);

    int undolr_recording_stop(void);

    int undolr_recording_save(const char* filename);

    int undolr_recording_stop_and_save(const char* filename);

    int undolr_save_on_termination(const char* filename);

    int undolr_save_on_termination_cancel(void);

    int undolr_event_log_size_get(long* o_bytes);

    int undolr_event_log_size_set(long bytes);

    int undolr_include_symbol_files(int include);

Page 27: Debugging: Love It, Hate It or Reverse It?

Live Recorder.

Use Live Recorder in internal testing:

Investigate test failures easily using reversible debugging. Avoid problems with differing environments. No need to reproduce complex multi-machine setups.

Can be used in different ways:

Disabled by default, but re-run failing tests with Live Recorder activated. Enabled by default, but tell Live Recorder to save recording only if test fails.

Have multiple developers work on the same test failure.

Page 28: Debugging: Love It, Hate It or Reverse It?

Live Recorder.

Use Live Recorder in customer releases:

No overhead if not used. You and your customer control when/if recording is enabled. Customer has control over pruning the recording to protect their IP. Debug an exact copy of a customer failure, without having to create a test-case. Have multiple developers work on the same customer bug.

Page 29: Debugging: Love It, Hate It or Reverse It?

Live Recorder: Demo.

Page 30: Debugging: Love It, Hate It or Reverse It?

Questions?

Live Recorder.

Page 31: Debugging: Love It, Hate It or Reverse It?

EOF.

http://undo.io/