the s2e platform · 2018. 7. 12. · vitaly chipounov • ph.d. in 2014 at epfl, switzerland dslab,...

57
The S2E Platform Finding vulnerabilities in Linux and Windows programs Vitaly Chipounov https://s2e.systems Ready-for-use docker image, demos, tutorials, source code, documentation

Upload: others

Post on 25-Jan-2021

5 views

Category:

Documents


0 download

TRANSCRIPT

  • The S2E PlatformFinding vulnerabilities in

    Linux and Windows programs

    Vitaly Chipounov

    https://s2e.systemsReady-for-use docker image, demos,
tutorials, source code, documentation

  • Vitaly Chipounov

    • Ph.D. in 2014 at EPFL, Switzerland 
DSLAB, George Candea

    • S2E started back in 2008 when we wanted to reverse engineer Windows device drivers using symbolic execution

    • Realized that we could also use this to find bugs in drivers, do performance analysis, etc. => created the S2E platform


    • Co-founder and chief architect at Cyberhaven • Finalists in the Cyber Grand Challenge

  • • Evaluate software for vulnerabilities (Attack) • Defend software against attacks (Defend) • Keep software running and available (Availability)

    The World's First All-Machine 
Hacking Tournament

    Two teams used S2E

    Team CodeJitsu

    Cyberhaven
UC Berkeley


    Syracuse University

    Team Disekt

  • Tutorial Overview (Part 1)

    • Overview of S2E and symbolic execution

    • Getting started with S2E

    • Design and Implementation

    • Automated generation of proofs of vulnerability

    Finding vulnerabilities in user-space apps with S2E

  • Tutorial Overview (Part 2)

    • Dealing with path explosion
>12 KLOC driver, large input files

    • Parallel symbolic execution, concolic execution, fuzzer integration, symbolic fault injection, function models, etc.

    Testing Windows device drivers with S2EFAT12/16/32 driver from the WDK (fastfat.sys)

  • S2E is a platform for in-vivo 
multi-path analysis of software systems

    Bug finding Verification

    Testing Security checking

    Extensible Write your own tools

    Symbolic execution Concolic execution

    State merging Fuzzing

    On real OSes, with real apps, libraries, drivers

    Pretty much anything that runs on computers

  • void main(int argc, char **argv) { int r = 1, i = 1;

    if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } }

    Dynamic Symbolic Execution

    ✓ ✓

  • void main(int argc, char **argv) { int r = 1, i = 1;

    if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } }

    30 GB disk 4 GB RAM?

    Dynamic Symbolic Execution

  • 30 GB disk 4 GB RAM?

    void main(int argc, char **argv) { int r = 1, i = 1;

    if (i < argc) { if (argv[i][0] == 'n') { r = 0; ++i; } }

    Dynamic Symbolic Execution

  • #include

    int main(int argc, char **argv) { FILE *fp = fopen(argv[1], "rb"); if (!fp) { goto err; }

    int data; if (fread(&data, sizeof(data), 1, fp) != 1) { goto err; }

    if (data == 0xdeadbeef) { printf("Hello world\n"); } else { *(char *)0xbadcafe = 0; }

    err: if (fp) { fclose(fp); }

    return 0; }

  • Quick Startssh -CX [email protected]

    $ wget -O crash.c https://pastebin.com/raw/B1BbapMV $ gcc -g -O0 —m32 -o ./crash ./crash.c $ source s2e/s2e_activate $ s2e new_project ./crash @@ $ cd s2e/projects/crash $ ./launch-s2e.sh …

    mailto:[email protected]://pastebin.com/raw/B1BbapMV

  • Analysis Output$ ls -1 s2e/projects/crash/s2e-last/ assembly.ll debug.txt ExecutionTracer.dat info.txt module.bc run.stats tbcoverage-0.json tbcoverage-1.json testcase-crash:0x50f:0x804858a-0-_tmp_input testcase-kill-0-_tmp_input testcase-kill-1-_tmp_input warnings.txt

  • Reproducing crashes with concrete test cases

    ~/s2e/projects/crash$ gdb --args ./crash s2e-last/testcase-crash:0x50f:0x804858a-0-_tmp_input

    (gdb) r Starting program: /home/vitaly/s2e/projects/crash/crash s2e-last/testcase-crash:0x50f:0x804858a-0-_tmp_input

    Program received signal SIGSEGV, Segmentation fault. 0x0804858a in main (argc=2, argv=0xffffcec4) at crash.c:18 18 *(char *)0xbadcafe = 0; (gdb)

  • Profiling Path Explosion$ s2e forkprofile crash … 01295 crash:0x08048571 1 /home/ubuntu/s2e/env/crash.c:15 (main)

    if (data == 0xdeadbeef) { printf("Hello world\n"); } else { *(char *)0xbadcafe = 0; }

    More on that in the 2nd part of the tutorial…

  • Displaying Code Coverage

    $ s2e coverage lcov --html crash

    Overall coverage rate: lines......: 84.6% (11 of 13 lines) functions..: no data found


    Line coverage saved to
 /home/vitaly/s2e/env/projects/crash/s2e-last/crash.info.

An HTML report is available in 
 /home/vitaly/s2e/env/projects/crash/s2e-last/crash_lcov

  • $ s2e coverage lcov --html crash

  • OverviewFinding vulnerabilities in user-space apps with S2E

    • Overview of S2E and symbolic execution

    • Getting started with S2E

    • Design and Implementation

    • Automated generation of proofs of vulnerability

  • Valgrind AddressSanitizer KLEE Astrée TEMU BitBlaze VMware CoreDet SyncFinder ESD Parfait Saturn Calysto LLBMC Isabelle

    CUTE EXE SJPF Coverity CodeSonar DTrace Oprofile BAP IDApro Jakstab Angr SimOS Pin PinOS LFI

    ThreadSanitizer Manticore Dingo SFI Nooks CFI Dimmunix VeriSoft Eraser SAGE DART Cloud9 BitScope bddbddb …

    Testing

    Verification

    Profiling

    Simulation Isolation

    Immunit

    y

    Debugging

    Check properties on 
execution paths

  • ./prog

    argc != 2

    ./prog 123

    argc == 2

    int main(argc, argv) { if (argc == 2) { printf(“%c”, *argv[2]); … } return 0; }

    Checking Properties on Execution Paths

  • Enumerating All Execution Paths

    int main(argc, argv) { void *p = malloc(…);
 if (!p) { exit(-1); } … return 0; }

    Program

    Kernel

    Libraries

    Hardware

    ~2 pathssystem size

    Off-limits

  • Environment Modelling

    int main(argc, argv) { void *p = malloc(…);
 if (!p) { exit(-1); } … return 0; }

    Program

    Environment 
Model

    ~ 2 pathsprogram sizeKLEE Cloud9 SLAM LLBMC Coverity …

    Impractical

  • Calling the Environment

    int main(argc, argv) { void *p = malloc(…);
 if (!p) { exit(-1); } … return 0; }

    Program

    Kernel

    Libraries

    Hardware

    False negatives

    False positives

    KLEE EXE DART Fuzzing …

    ~ 2 pathsprogram size

  • Analysis Tools Make Trade-offs

    • Accuracy vs. performance
False positives vs. false negatives

    • Types of analysis
Testing, verification, profiling, etc.

    • Software stack level
Applications, libraries, kernel modules, etc.

    • Source code vs. binaries

  • S2E is a Platform that Enables Flexible Trade-offs

    • Accuracy vs. performance 
Symbolic execution, state merging, fuzzing, etc. 
You can write models, but don’t have to!

    • Types of analysis
Wide range of analyses

    • Software stack level
In-vivo, at any level of software stack

    • Source code vs. binaries 
Works on binaries

  • Kernel

    Libraries

    Hardware

    Applications

    Distributed systemsReverse


    engineering

    RevNIC [EUROSYS’10]

    Profiling

    PROFs [ASPLOS’11]

    Testing

    BIOS Testing [WOOT’15] Avatar [NDSS’14] CHEF [ASPLOS’14] Achilles [ASPLOS’13] SWIFT [EUROSYS’12] SymDrive [OSDI’12] SymNet [WRIPE’12] DDT [USENIX’11] …

    VerificationSoftware router
dataplanes [NSDI’14]

    Security

    CVE-2015-1536

  • Analysis 
Plugins

    Symbolic Execution

    Engine

    Dynamic
Binary


    Translator

    Instrumentation Engine

    Path Selection
Plugins

    S2E

    Applications

    Libraries

    Kernel Drivers

    Virtual Hardware

    VM

    What input to make symbolicWhat input to make concreteSearch heuristics

    Check for crashes, vulnerability conditions, performance metrics, etc.

  • • S2E uses QEMU

    • S2E and QEMU are completely decoupled

    • S2E is contained in libs2e.so

    • libs2e.so intercepts and replaces /dev/kvm functionality

    • Need a few simple KVM extensions to intercept DMA, disk R/W, and device state snapshotting

    • You don’t have to use QEMU with S2E

    KVM Extensions for Symbolic Execution

    Analysis 
Plugins

    Symbolic Execution

    Engine

    Dynamic
Binary


    Translator

    Instrumentation Engine

    Path Selection
Plugins

    S2E

    Applications

    Libraries

    Kernel Drivers

    Virtual Hardware

    VMli

    bs2e

    .so

    KVM-compatible interface/dev/kvm

  • • We refactored QEMU’s translator to make it standalone

    • libcpu, libtcg: code translation and generation libraries

    • libs2ecore, libs2eplugins, klee, libvmi, etc.

    • You can reuse these in your own projects

    • You can swap out the symbolic execution engine with your own if you want

    Modular Architecture

    Analysis 
Plugins

    Symbolic Execution

    Engine

    Dynamic
Binary


    Translator

    Instrumentation Engine

    Path Selection
Plugins

    S2E

    Applications

    Libraries

    Kernel Drivers

    Virtual Hardware

    VMli

    bs2e

    .so

    KVM-compatible interface/dev/kvm

  • Dynamic Binary Translation

    0x80000000: mov [ebx], eax

    void tb_0x80000000(cpu) { tmp1 = cpu->regs[EBX]; tmp2 = cpu->regs[EAX];

    __stl_mmu(tmp1, tmp2); }

    while(true) { tb = translate(cpu->pc) tb->func(cpu); }}

  • Dynamic Binary Translation

    0x80000000: mov [ebx], eax

    void tb_0x80000000(cpu) { tmp1 = cpu->regs[EBX]; tmp2 = cpu->regs[EAX];

    __stl_mmu(tmp1, tmp2); }

    Host-independent micro-operations

    Frontend

    Host instructions
(x86, arm, mips, etc.)

    Backend
(one per target 
architecture)

    while(true) { tb = translate(cpu->pc) tb->func(cpu); }} translate()

  • Dynamic Binary Translation

    translate(pc) { do { ins = disassemble(pc); 
 emit_uops(ins); pc += ins.size; } while (ins != jmp); }}

    while(true) { tb = translate(cpu->pc) tb->func(cpu); }}

    translate(pc) { do { if (s2e_instrument_ins(pc)) {
 emit_uops_s2e(); }} ins = disas(pc); emit_uops(ins); pc += ins.size; } while (ins != jmp); }}

  • Code Instrumentationtranslate(pc) { do { ins = disassemble(pc); 
 emit_uops(ins); pc += ins.size; } while (ins != jmp); }}

    translate(pc) { do { if (s2e_instrument_ins(pc)) {
 emit_uops_s2e(); }} ins = disas(pc); emit_uops(ins); pc += ins.size; } while (ins != jmp); }}

    0x80000000: mov [ebx], eax

    void tb_0x80000000(cpu) { s2e_call_plugins();

    tmp1 = cpu->regs[R_EBX]; tmp2 = cpu->regs[R_EAX];

    __stl_mmu(tmp1, tmp2); }

    void tb_0x80000000(cpu) { tmp1 = cpu->regs[EBX]; tmp2 = cpu->regs[EAX];

    __stl_mmu(tmp1, tmp2); }

  • Dynamic Binary Translation

    0x80000000: mov [ebx], eax

    void tb_0x80000000(cpu) { tmp1 = cpu->regs[EBX]; tmp2 = cpu->regs[EAX];

    __stl_mmu(tmp1, tmp2); }

    Host-independent micro-operations

    Frontend

    Host instructions
(x86, arm, mips, etc.)

    Backend
(one per target 
architecture)

    while(true) { tb = translate(cpu->pc) tb->func(cpu); }} translate()

    +LLVM

  • define i64 @tb_0x80000000(i64*) #12 { entry: %loc_18ptr = alloca i32 %loc_19ptr = alloca i32 %1 = getelementptr i64, i64* %0, i32 0 %2 = load i64, i64* %1 %eax_ptr = getelementptr %struct.CPUX86State, …, i32 0, i32 0 %ebx_ptr = getelementptr %struct.CPUX86State, …, i32 0, i32 2 %eax = load i32, i32* %eax_ptr %ebx = load i32, i32* %ebx_ptr call void @__stl_mmu(i32 %ebx, i32 %eax) … }

    Symbolic Execution0x80000000: mov [ebx], eax

    KLEE

  • Applications

    Libraries

    Kernel Drivers

    Analysis 
Plugins

    Symbolic Execution

    Engine

    Dynamic
Binary


    Translator

    Instrumentation Engine

    Virtual Hardware

    Path Selection
Plugins

    S2E

    VM

    Core execution events

    onTranslateInstructionStart onMemoryAccess onTimer onStateFork onStateKill onCustomInstruction …

    Writing Plugins

  • void MyPlugin::initialize() { s2e()->getCorePlugin()->onTranslateInstructionStart->
 connect(MyPlugin::onInstructionTranslation); }

    void MyPlugin::onInstructionTranslation(signal, pc) { if (pc == crashHandlerPc) { signal->connect(MyPlugin::onInstructionExecution, …); } }

    void MyPlugin::onInstructionExecution(state) { s2e()->getExecutor()->terminateState(…); }

    Where do I get the crash handler pc? Is it going to change between OSes? Where do I get information about the crash (pid, pc, etc.)? Is the crash handler pc in the correct address space?

    => Requires complex virtual machine introspection

  • void MyPlugin::initialize() {}

    void MyPlugin::handleOpcodeInvocation(state, data, size) { my_plugin_data_t data; s2e()->mem()->read(&data, data, size…); s2e()->getExecutor()->terminateState(data.pid, “crashed”); }}

    // linux-4.9.3/arch/x86/mm/fault.c static void force_sig_info_fault(int si_signo, int si_code, unsigned long address, …) { unsigned lsb = 0; siginfo_t info;

    my_plugin_data_t data; data.pid = current->pid; data.address = address; s2e_invoke_plugin(“MyPlugin”, &data, sizeof(data); … }}

    Do as much work as possible inside the guest!

  • Applications

    Libraries

    Kernel Drivers

    Analysis 
Plugins

    Symbolic Execution

    Engine

    Dynamic
Binary


    Translator

    Instrumentation Engine

    Virtual Hardware

    Path Selection
Plugins

    S2E

    VM

    s2e_invoke_plugin();

    OS Monitoring Plugins

    onProcessLoad() onProcessExit() onThreadCreate() onThreadExit() onSegFault() …

    S2E Guest Tools

    #!/bin/bash

    s2ecmd kill -1 “error” s2eget file.txt …

  • Overview

    • Overview of S2E and symbolic execution

    • Getting started with S2E 
Finding simple crashes and getting code coverage

    • Design and Implementation

    • Automated generation of proofs of vulnerability

    Finding vulnerabilities in user-space apps with S2E

  • What is a proof of vulnerability (PoV)?

    • Input to the program that can control the program’s state arbitrarily

    • Set the program counter and a general purpose register to an arbitrary agreed upon value

    • Exfiltrate data from memory

    • Useful for developers to triage bugs (a crash in itself may not be exploitable).

    $ gdb —args ./vulnerable-program pov_input.txt SEGFAULT at 0xdeadbeef

  • int main(int argc, char **argv) { char buf[4]; receive(buf, 12); return 0; }

    pushebpmovebp,espsubesp,4;Allocate4bytesonthestackforbufleaeax,[ebp-4];Computeaddressofbufpush12;Push12for2ndparameterofreceivepusheax;Pushaddressofbuffor1stparamofreceivecallreceivexoreax,eax;Setreturnvalueto0leave;Cleanthestackframeret;Returnfromthemainfunction

    echo AAAAAAAABBBBBBBBCCCCCCCC | xxd -r -p | ./program

    Segfault at address 0xCCCCCCCC

  • pushargvpushargccallmain

    main:pushebpmovebp,espsubesp,4leaeax,[ebp-4]push12pusheaxcallreceivexoreax,eaxleaveret

    0xf8:0xabc0;argv0xf4:0xdef0;argc0xf0:0x800231;returnaddress0xec:0x1000;savedframepointer0xe8:????????;spaceforthebuffer0xe4:12;sizeoftheparameterpassedtoreceive0xe0:0xe8;addressofthebufferonthestack

    0xf8:0xabc0;argv0xf4:0xdef0;argc0xf0:CCCCCCCC;returnaddress0xec:BBBBBBBB;savedframepointer0xe8:AAAAAAAA;spaceforthebuffer0xe4:12;sizeoftheparameterpassedtoreceive0xe0:0xe8;addressofthebufferonthestack

  • int main(int argc, char **argv) { char buf[4]; receive(buf, 12); return 0; }

    echo λ0λ1λ2λ3λ4λ5λ6λ7λ8λ9λ10λ11 | ./program

    Segfault at address λ11λ10λ9λ8

  • pushargvpushargccallmain

    main:pushebpmovebp,espsubesp,4leaeax,[ebp-4]push12pusheaxcallreceivexoreax,eaxleaveret

    0xf8:0xabc0;argv0xf4:0xdef0;argc0xf0:0x800231;returnaddress0xec:0x1000;savedframepointer0xe8:????????;spaceforthebuffer0xe4:12;sizeoftheparameterpassedtoreceive0xe0:0xe8;addressofthebufferonthestack

    0xf8:0xabc0;argv0xf4:0xdef0;argc0xf0:λ11λ10λ9λ8;returnaddress0xec:λ4λ5λ6λ7;savedframepointer0xe8:λ0λ1λ4λ3;spaceforthebuffer0xe4:12;sizeoftheparameterpassedtoreceive0xe0:0xe8;addressofthebufferonthestack

  • 0xf8:0xabc0;argv0xf4:0xdef0;argc0xf0:λ11λ10λ9λ8;returnaddress0xec:λ4λ5λ6λ7;savedframepointer0xe8:λ0λ1λ4λ3;spaceforthebuffer0xe4:12;sizeoftheparameterpassedtoreceive0xe0:0xe8;addressofthebufferonthestack

    onSymbolicAddress(S2EExecutionState *state,
 klee::ref address, …)

    pushargvpushargccallmain

    main:pushebpmovebp,espsubesp,4leaeax,[ebp-4]push12pusheaxcallreceivexoreax,eaxleaveret

    1. Add constraint address == 0xbadcafe 2. Solve constraints 3. Get concrete test case

    Segfault at address 0xbadcafe

    echo 0000000000000000fecaad0b | xxd -r -p | ./program

  • Practice Time!ssh -CX [email protected]

    $ wget -O vulnerable.c https://pastebin.com/raw/rBYFGEJS

    $ gcc -g -w -fno-stack-protector -D_FORTIFY_SOURCE=0 -O0 -m32 \ -o ./vulnerable vulnerable.c

    $ source s2e/s2e_activate $ s2e new_project —-enable-pov-generation ./vulnerable @@ $ cd s2e/projects/vulnerable $ ./launch-s2e.sh …

    mailto:[email protected]://pastebin.com/raw/rBYFGEJS

  • Testcases for 
PoVs and Crashes

    $ls~/s2e/projects/vulnerable/s2e-last_tmp_input-crash@0x0-pov-unknown-0_tmp_input-crash@0x0-pov-unknown-3_tmp_input-recipe-type1_i386_generic_reg_ebp.rcp@0x8048760-pov-type1-0_tmp_input-recipe-type1_i386_generic_reg_edi.rcp@0x8048760-pov-type1-0_tmp_input-recipe-type1_i386_generic_shellcode_eax.rcp@0x804883e-pov-type1-3...

  • Validating PoVs$gdb--args./vulnerables2e-last/_tmp_input-recipe-type1_i386_generic_shellcode_eax.rcp@0x804883e-pov-type1-3(gdb)r

    Startingprogram:/home/user/s2e/env/projects/vuln-lin32-32/vulne…

    Demoingfunctionpointeroverwrite

    ProgramreceivedsignalSIGSEGV,Segmentationfault.0x44556677in??()(gdb)inforegisterseax0xccddeeff-857870593ecx0x445566771146447479edx0x4064ebx0x00esp0xffffcd1c0xffffcd1cebp0xffffcd680xffffcd68esi0x804b008134524936…

  • Controlling Register Values

    ---------------------------------------------------------------------------------ThispluginwritesPoVsasinputfiles.Thisissuitableforprogramsthat--taketheirinputsfromfiles(insteadofstdinorothermethods).add_plugin("FilePovGenerator")pluginsConfig.FilePovGenerator={--ThegeneratedPoVwillsettheprogramcounter--ofthevulnerableprogramtothisvaluetarget_pc=0x0011223344556677,

    --ThegeneratedPoVwillsetageneralpurposeregister--ofthevulnerableprogramtothisvalue.target_gp=0x8899aabbccddeeff}

    s2e-config.lua

  • Recipes for PoVs• Recipes tell S2E how to constrain the input in order

    to get interesting PoVs

    • The recipe plugin monitors execution for symbolic pointers

    • When the recipe plugin detects a symbolic pointers, it looks at the available recipes and applies those that satisfy constraints

  • $ ls ~/s2e/projects/vulnerable/recipe type1_i386_generic_reg_eax.rcp type1_i386_generic_reg_ebp.rcp type1_i386_generic_reg_esi.rcp … type1_i386_generic_reg_esp.rcp type1_i386_generic_shellcode_eax.rcp type1_i386_generic_shellcode_ebp.rcp

    Recipes for PoVs

  • # Set GP and EIP with shellcode # mov eax, $gp # mov ecx, $pc # jmp ecx :type=1 :reg_mask=0xffffffff :pc_mask=0xffffffff :arch=i386 :platform=generic :gp=EAX :exec_mem=EIP [EIP+0] == 0xb8 [EIP+1] == $gp[0] [EIP+2] == $gp[1] [EIP+3] == $gp[2] [EIP+4] == $gp[3] [EIP+5] == 0xb9 [EIP+6] == $pc[0] [EIP+7] == $pc[1] [EIP+8] == $pc[2] [EIP+9] == $pc[3] [EIP+10] == 0xff [EIP+11] == 0xe

  • # Set GP and EIP with shellcode # mov eax, $gp # mov ecx, $pc # jmp ecx :reg_mask=0xffffffff :pc_mask=0xffffffff :type=1 :arch=i386 :platform=generic :gp=EAX :exec_mem=EIP [EIP+0] == 0xb8 [EIP+1] == $gp[0] [EIP+2] == $gp[1] [EIP+3] == $gp[2] [EIP+4] == $gp[3] [EIP+5] == 0xb9 [EIP+6] == $pc[0] [EIP+7] == $pc[1] [EIP+8] == $pc[2] [EIP+9] == $pc[3] [EIP+10] == 0xff [EIP+11] == 0xe

    [EIP+0] == 0xb8 [EIP+1] == 0xaa [EIP+2] == 0xaa [EIP+3] == 0xaa [EIP+4] == 0xaa [EIP+5] == 0xb9 [EIP+6] == 0xbb [EIP+7] == 0xbb [EIP+8] == 0xbb [EIP+9] == 0xbb [EIP+10] == 0xff [EIP+11] == 0xe

    EIP=λ11λ10λ9λ8

    Look for address ranges that are - executable - contain symbolic dataWhen such a range is found (e.g, at address 0xb8001200) - add constraint EIP == 0xb8001200 - add constraints [EIP+i] == 0x…If constraints satisfiable => we have a PoV

  • Challenges• Unreplayable PoVs


    Randomness (/dev/random, ASLR, etc.)

    • Arbitrary memory reads and writes
Require plugin support to track interesting memory regions

    • Interactive CGC-style PoVs
We have seen file-based PoVs so far

    More details on s2e.systems/docs

  • Conclusion

    • S2E enables flexible trade-offs 
Source/binary, environment, SW stack level, etc.

    • Using S2E to generate proofs of vulnerabilities

    https://s2e.systemsReady-for-use docker image, demos,
tutorials, source code, documentation

    We are hir

    ing!