[cb16] cofi break – breaking exploits with processor trace and practical control flow integrity by...
TRANSCRIPT
![Page 1: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/1.jpg)
Anti exploitation and Control Flow Integrity with Processor Trace
![Page 2: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/2.jpg)
Brought to you by
Shlomi Oberman, independent security researcher
Ron Shina, independent security researcher
![Page 3: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/3.jpg)
Tracing – what executed and when?
Code optimization and profiling
◦ Sampling
◦ Instrumentation
Intel Processor Trace (PT)
![Page 4: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/4.jpg)
Intel PT
Processor feature enabling instruction tracing with low overhead – documentation says about 5%
◦ Tens of times faster than the previous option
Available on Intel Broadwell and Skylake processors
A similar feature, Real Time Instruction Trace, exists on certain Intel Atom processors
![Page 5: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/5.jpg)
Intel PT
![Page 6: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/6.jpg)
Packets
Processor writes trace to memory as packets
Packet Types
◦ Taken / Not Taken packets for conditional branches
◦ IP packets for indirect branches
◦ Timestamp packets
◦ …
Binary is needed to recreate the instruction trace
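To illustrate why the binary is needed, here is a minimal sketch of replaying packets over a known program layout. The toy "binary", its addresses, and the simplified `("TNT", bool)` / `("TIP", addr)` packet tuples are all invented for illustration; real PT packets are bit-packed and compressed and would be decoded with a library such as libipt.

```python
# Toy "binary": each basic block ends in one kind of control transfer.
# All addresses and block layouts are hypothetical.
TOY_BINARY = {
    0x1000: ("cond", 0x1010, 0x1020),  # (kind, taken target, fall-through)
    0x1010: ("jmp", 0x1030),           # direct jump: no packet is needed
    0x1020: ("indirect",),             # indirect branch: target comes from an IP packet
    0x1030: ("cond", 0x1000, 0x1040),
    0x1040: ("end",),
    0x2000: ("jmp", 0x1030),
}

def reconstruct(start, packets):
    """Replay simplified packets over the toy binary and return the block trace."""
    packets = iter(packets)
    trace, addr = [], start
    while True:
        trace.append(addr)
        block = TOY_BINARY[addr]
        kind = block[0]
        if kind == "end":
            return trace
        if kind == "jmp":
            # Direct targets are read from the binary, not from the trace.
            addr = block[1]
        elif kind == "cond":
            tag, taken = next(packets)
            assert tag == "TNT"
            addr = block[1] if taken else block[2]
        elif kind == "indirect":
            tag, target = next(packets)
            assert tag == "TIP"
            addr = target

path = reconstruct(0x1000, [("TNT", False), ("TIP", 0x2000), ("TNT", False)])
print([hex(a) for a in path])
```

The packets only record decisions; which instruction each decision belongs to is recovered by walking the code itself.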
![Page 7: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/7.jpg)
call to foo
branch taken / not taken
Decoded Trace Packets
![Page 8: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/8.jpg)
User and/or kernel tracing
Filter by process
Starting or stopping the trace based on address ranges (only in later processors)
Configuration options
![Page 9: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/9.jpg)
Atom processors supporting RTIT – tracing guests possible, but not the hypervisor
Broadwell – no support at all
Skylake – full support
Tracing VM guests and hypervisors
![Page 10: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/10.jpg)
Intel PT output + Traced Program’s Binary → Instruction Trace
![Page 11: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/11.jpg)
Linux kernel 4.1 comes with integrated PT support
Linux kernel 4.3 supports tracing using perf user tools
An open source PT decoding library – libipt
Gdb 7.10 supports using PT for tracing
simple-pt – an open source implementation of PT on Linux (used to create the trace pictures on the previous slide)
* processor supporting PT included separately ;)
Want to use Processor Trace right now? *
![Page 12: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/12.jpg)
Exploitation and the NX Bit
Hi!
shellcode
When pdf is opened, the shellcode will be in memory that isn’t executable – NX bit
How do attackers run the code to make their shellcode executable?
◦ Use code that is already executable (the program’s code)
This exploitation technique comes in many forms, most notably, ROP – Return Oriented Programming
![Page 13: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/13.jpg)
Reusing executable memory already in the program usually means moving through the process in unusual ways, for example:
◦ Not returning to a function’s caller
◦ Calling addresses in the middle of functions, instead of at the beginning
◦ …
“Jump Around, Jump around…” / House of Pain
Hi!
shellcode
![Page 14: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/14.jpg)
Establish rules for how the code flows in the process
◦ Functions return to their callers
◦ Calls are made to the beginning of functions
◦ …
How can those rules be enforced?
◦ Add rule checking to the program’s binary
◦ Trace the program while running and go over the log (this work)
◦ Use other CPU features to detect “surprising” branches
“Control Flow Integrity Principles, Implementations, and Applications”, Abadi, Budiu, Erlingsson, Ligatti, 2005
Control Flow Integrity (CFI)
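The two example rules can be sketched as a check over a stream of call/return events. This is a toy model under assumptions: the event tuples, the `check_cfi` name, and the addresses are hypothetical, and a real verifier would derive such events from the decoded trace.

```python
def check_cfi(events, function_entries):
    """Check two CFI rules over a list of events.

    Events are ("call", target, return_addr) or ("ret", target).
    Returns a list of (description, address) violations.
    """
    shadow_stack = []  # expected return addresses, innermost call last
    violations = []
    for event in events:
        if event[0] == "call":
            _, target, return_addr = event
            # Rule: calls are made to the beginning of functions.
            if target not in function_entries:
                violations.append(("call into middle of function", target))
            shadow_stack.append(return_addr)
        elif event[0] == "ret":
            _, target = event
            # Rule: functions return to their callers.
            expected = shadow_stack.pop() if shadow_stack else None
            if target != expected:
                violations.append(("return not to caller", target))
    return violations

entries = {0x400, 0x500}
benign = [("call", 0x400, 0x104), ("call", 0x500, 0x410), ("ret", 0x410), ("ret", 0x104)]
rop_like = [("call", 0x400, 0x104), ("ret", 0x777), ("ret", 0x888)]
print(check_cfi(benign, entries))
print(check_cfi(rop_like, entries))
```

A chain of returns that never matches a preceding call, as in ROP, shows up as one violation per gadget.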
![Page 15: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/15.jpg)
“Security Breaches as PMU Deviation”, Yuan, Xing, Chen, Zang 2011
“kBouncer: Efficient and Transparent ROP Mitigation” – Pappas, Winner of Microsoft BlueHat competition 2012, uses previous CPU branch tracing capabilities
“CFIMon: Detecting Violation of Control Flow Integrity using Performance Counters” – Xia, Liu, Chen, Zang 2012
“Taming ROP on Sandy Bridge”, Wicherski of Crowdstrike, 2013
“Transparent ROP Detection using CPU Performance Counters”, Li, Crouse, THREADS 2014
and more…
Prior Work
![Page 16: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/16.jpg)
Anti exploitation system to scan files based on CFI (think pdf on Adobe Reader)
Detects whether “illegal” returns were made, like in ROP
◦ Easy to add other CFI mitigations, such as checking the targets of calls (no calls to the middle of functions, …)
(Soon to be) Open Source
Developed in 2015
Our Implementation
![Page 17: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/17.jpg)
Verifying CFI via Processor Trace
Was the flow OK? Just follow the arrows and calls using the PT generated packets
![Page 18: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/18.jpg)
What information is needed to follow the execution and verify it?
Control Flow Graph (CFG)
◦ Location of functions
◦ Location of basic blocks
◦ …
Need this for all the libraries loaded by the process – Adobe Reader dlls, Windows dlls
◦ If not – false positives
All we have is debugging symbols, pdb files, for the Windows binaries
![Page 19: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/19.jpg)
We used IDA to recover the CFG
IDA didn’t do a good enough job
◦ Part of the functions and basic blocks in Adobe Reader / Windows binaries weren’t detected
Static Analysis
![Page 20: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/20.jpg)
When supporting a new version of Adobe Reader, IDA is used to get the initial CFG (static analysis)
Afterwards, many pdf files are traced with PT
◦ When a new basic block or function is discovered while following the trace – the CFG is updated
Repeat
◦ run IDA on the new CFG
◦ run the pdf files on IDA’s output
◦ If the CFG was updated in the last iteration – repeat
Dynamic Analysis
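The static/dynamic loop described above is a fixed-point iteration, which could be sketched as follows. Everything here is a stand-in: `static_analysis` plays the role of the IDA pass, block names replace addresses, and the assumption that replay reports only the first unknown block per trace is a simplification chosen to make the iteration visible.

```python
# Toy program: direct edges are visible to static analysis; other blocks are
# only revealed by traces. All names are hypothetical.
DIRECT_EDGES = {"A": ["B"], "B": [], "C": ["D"], "D": [], "E": []}

def static_analysis(seeds):
    """Stand-in for the disassembler pass: follow direct edges from known blocks."""
    known, todo = set(), list(seeds)
    while todo:
        block = todo.pop()
        if block not in known:
            known.add(block)
            todo.extend(DIRECT_EDGES[block])
    return known

def first_unknown(trace, cfg):
    """Return the first traced block missing from the CFG, or None."""
    return next((b for b in trace if b not in cfg), None)

def refine_cfg(entry, traces):
    """Alternate static analysis and trace replay until the CFG stops growing."""
    seeds = {entry}
    iterations = 0
    while True:
        iterations += 1
        cfg = static_analysis(seeds)
        discovered = {first_unknown(t, cfg) for t in traces} - {None}
        if not discovered:
            return cfg, iterations
        seeds |= discovered  # feed newly seen blocks back to static analysis

cfg, rounds = refine_cfg("A", [["A", "B", "C", "D", "E"]])
print(sorted(cfg), rounds)
```

Each round lets static analysis expand from blocks that only the traces could reveal, and the loop ends when a full pass over the traces finds nothing new.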
![Page 21: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/21.jpg)
Most of the edges in the CFG are:
◦ Calls relative to the current IP (no packet for those)
◦ Conditional branches
When traversing the CFG during trace verification, fetching the next node in these cases has to be (very) fast
Since the CFG is fixed and built in preprocessing, this isn’t a problem
Optimization
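One way to make fetching the next node cheap, assuming the CFG really is fixed after preprocessing, is to flatten it into dense successor tables so that each taken / not-taken decision costs a single index lookup. The layout below (`build_tables`, `follow`, and the sample addresses) is a hypothetical sketch, not the authors' data structure.

```python
def build_tables(cfg):
    """cfg maps block address -> (taken_addr, fallthrough_addr).

    Returns (addrs, index, succ) where succ[i] holds the node indices of the
    taken and fall-through successors of block addrs[i].
    """
    addrs = sorted(cfg)
    index = {addr: i for i, addr in enumerate(addrs)}
    succ = [(index[cfg[a][0]], index[cfg[a][1]]) for a in addrs]
    return addrs, index, succ

def follow(succ, start, tnt_bits):
    """Walk the precomputed table; no disassembly or dict lookups in the loop."""
    node = start
    for taken in tnt_bits:
        node = succ[node][0 if taken else 1]
    return node

cfg = {0x10: (0x20, 0x30), 0x20: (0x10, 0x30), 0x30: (0x30, 0x30)}
addrs, index, succ = build_tables(cfg)
end = follow(succ, index[0x10], [True, True, False])
print(hex(addrs[end]))
```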
![Page 22: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/22.jpg)
Ideally, no disassembly and CFG modification (slow) would be done during verification
However, some of the code analyzed is created dynamically – as long as it doesn’t change, this can be dealt with in preprocessing
In cases where it changes every time “Adobe Reader” is run to open a file, preprocessing isn’t enough
◦ code is disassembled and CFG is updated
Optimization
![Page 23: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/23.jpg)
Following the execution trace is done on a per thread basis
How to know which thread was executing at each part of the trace?
◦ PT packets give timing information, but only output the current process
Thread information
![Page 24: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/24.jpg)
Event Tracing for Windows (ETW)
◦ It should be possible to get the thread context switch times, as TSC values, from the CSwitch events provided by ETW
◦ These timestamps could then be synched with the TSC packets from PT to determine which thread was running in each part of the trace
Thread Information
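The synchronization idea can be sketched as a sorted lookup: given switch-in timestamps, each timestamped piece of the trace is attributed to the thread that was most recently switched in. This assumes a single core and already-parsed `(tsc, thread_id)` pairs; the function names and data shapes are hypothetical, and parsing real ETW CSwitch events or PT TSC packets is not shown.

```python
import bisect

def thread_at(cswitches, tsc):
    """cswitches: list of (tsc, thread_id) sorted by tsc, one entry per switch-in.

    Returns the thread running at the given timestamp, or None if the
    timestamp precedes the first recorded switch.
    """
    times = [t for t, _ in cswitches]
    i = bisect.bisect_right(times, tsc) - 1
    return cswitches[i][1] if i >= 0 else None

def split_by_thread(cswitches, pt_chunks):
    """pt_chunks: list of (tsc, payload). Returns {thread_id: [payload, ...]}."""
    per_thread = {}
    for tsc, payload in pt_chunks:
        per_thread.setdefault(thread_at(cswitches, tsc), []).append(payload)
    return per_thread

switches = [(100, 1), (200, 2), (300, 1)]
print(split_by_thread(switches, [(110, "a"), (210, "b"), (310, "c")]))
```

Each per-thread list can then be followed independently, which matches the per-thread verification described earlier.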
![Page 25: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/25.jpg)
What about getting a callback every time a thread in the traced process is switched in?
◦ AFAWK, no direct way
◦ We hooked the Windows context switch function - don’t do that
◦ Endgame presented a way to achieve this via Asynchronous Procedure Calls (Blackhat 2016)
Thread Information
![Page 26: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/26.jpg)
Need to know the executable memory ranges at all points in the trace – what modules are loaded
Knowing when the PT trace reached ntdll!LdrLoadDll and ntdll!LdrUnloadDll isn’t enough
◦ Module name is needed to update the current memory map
ETW was used to retrieve module load / unload name and time (tsc) and this is then synched with the times of the load/unload functions in the trace
Module load / unload
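A minimal sketch of keeping the memory map in step with the trace: timestamped load/unload events are replayed up to a given point, and an address is then resolved against whatever is loaded at that moment. The event tuple shape, module names, and addresses are invented for illustration.

```python
def build_memory_map(events, at_tsc):
    """Replay (tsc, kind, name, base, size) events up to at_tsc.

    kind is "load" or "unload". Returns {name: (start, end)} for the
    modules loaded at that point in time.
    """
    loaded = {}
    for tsc, kind, name, base, size in sorted(events):
        if tsc > at_tsc:
            break
        if kind == "load":
            loaded[name] = (base, base + size)
        else:
            loaded.pop(name, None)
    return loaded

def module_for(memory_map, addr):
    """Return the module containing addr, or None if it is not in any module."""
    for name, (start, end) in memory_map.items():
        if start <= addr < end:
            return name
    return None

events = [
    (10, "load", "a.dll", 0x1000, 0x1000),
    (20, "load", "b.dll", 0x3000, 0x1000),
    (30, "unload", "a.dll", 0x1000, 0x1000),
]
print(module_for(build_memory_map(events, 25), 0x1800))
print(module_for(build_memory_map(events, 35), 0x1800))
```

An address that resolves to no module at its point in the trace is either dynamically generated code or a sign that the map is out of date.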
![Page 27: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/27.jpg)
For example:
◦ Exception dispatching code
◦ User mode callbacks
◦ …
When going over the trace, when suspected mismatches occur, the above special cases are checked via binary signatures
This mostly needs to be done per operating system, not per-application
Still not done – functions don’t always return to their callers
![Page 28: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/28.jpg)
(almost entirely) Not dealt with by our implementation
For PT tracing the code being executed is needed
One obvious problem is pages that get written to and executed from simultaneously
(maybe) One could remove the write permission every time a page becomes writable and executable and handle the access violation when it gets written to, in order to obtain the code’s new version
Dynamically generated code
![Page 29: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/29.jpg)
A case of dynamically generated code that was dealt with:
Applications that hook themselves… with identical hooks, at the same locations and same time
To the trace verifier, the code is essentially static
Dynamically generated code
![Page 30: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/30.jpg)
Benign, non malicious files
◦ Run on 10000 pdf, 3000 ppt/x, 3000 doc/x without false positives
Malicious files containing a ROP chain
◦ Run on 5 such files, detecting the exploit and displaying the CFI violation
Scanning Results
![Page 31: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/31.jpg)
you’d still need
◦ Module load / unload information
◦ Thread context switch times
but could somewhat do without
◦ The CFG – a partial CFG can be built from the trace (it doesn’t need to be built in advance)
Forget CFI and anti-exploitation…
What if I just want to trace a process quickly with Processor Trace?
![Page 32: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/32.jpg)
Control-flow Enforcement Technology announced by Intel in June 2016. Release date?
Processors will directly support:
◦ Shadow (call) stack tracking – a return that doesn’t match the shadow stack raises a control protection exception
◦ Indirect branch tracking – an indirect branch to a target containing an instruction other than ENDBRANCH raises a control protection fault
Coming soon to a motherboard near you
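The indirect branch tracking rule can be modelled in a few lines. This is a software toy, not how the hardware is programmed: the `TOY_CODE` map, the `ControlProtectionFault` class, and the instruction strings are invented, and "ENDBRANCH" follows the name used on the slide.

```python
class ControlProtectionFault(Exception):
    """Raised by the toy model when an indirect branch lands on a bad target."""

# Hypothetical code layout: address -> instruction at that address.
TOY_CODE = {
    0x100: "ENDBRANCH",  # legitimate indirect-branch target (function entry)
    0x101: "push rbp",
    0x110: "pop rdi",    # gadget start in the middle of a function
    0x111: "ret",
}

def indirect_branch(code, target):
    """Allow the branch only if the first instruction at the target is ENDBRANCH."""
    if code.get(target) != "ENDBRANCH":
        raise ControlProtectionFault(hex(target))
    return target

print(hex(indirect_branch(TOY_CODE, 0x100)))
try:
    indirect_branch(TOY_CODE, 0x110)
except ControlProtectionFault as fault:
    print("fault at", fault)
```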
![Page 33: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/33.jpg)
ARM has a feature similar to Processor Trace called CoreSight
Tracing on Linux has been integrated with perf
Open source decoding library exists – OpenCSD
http://www.linaro.org/blog/core-dump/coresight-perf-and-the-opencsd-library/
What about tracing quickly on ARM?
![Page 34: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/34.jpg)
“Control Jujutsu” – Evans, Long, Otgonbaatar, Shrobe, Rinard, Okhravi, Sidiroglou-Douskos, CCS 2015
Uses indirect call sites with controllable targets and arguments (via vulnerability) to achieve arbitrary code execution (e.g., call exec or system)
Bypasses CFI because the target functions are legal in the CFG
Bypassing CFI
![Page 35: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/35.jpg)
“Write Once, Pwn Anywhere”, Yu, Black Hat USA 2014
◦ Sometimes applications have security critical information in one variable
◦ Pseudo-code from Internet Explorer’s JavaScript engine:
if ((safemode & 0xB) == 0) { turn_on_god_mode(); }
Bypassing CFI with “data attacks”
![Page 36: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/36.jpg)
“Control Flow Bending”, Carlini, Barresi, Payer, Wagner, Gross, USENIX Security 2015
◦ printf-oriented-programming – if you control the arguments, printf can do arbitrary computation
Bypassing CFI with “data attacks”
![Page 37: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/37.jpg)
“Data oriented programming” – Hu, Shinde, Sendroiu, Chua, Saxena, Liang, IEEE S&P 2016
goal: perform arbitrary computation while adhering to the CFG
Similar to ROP in spirit – use parts of the original program as “instructions” of a “VM” controlled by the attacker
“data gadgets” are used to perform computation on data
Bypassing CFI with “data attacks”
![Page 38: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/38.jpg)
gadgets are executed one after the other by using constructs already in the vulnerable program – such as loops
the vulnerability being exploited is used to determine which data gadget gets run and on what data
“data oriented programming” (cont)
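A toy model of the idea, assuming a program whose loop already contains a few small operations: the attacker never redirects control flow, only chooses (through corrupted data) which existing operation runs on which memory cell in each iteration. The `server_loop` function, its operations, and the memory layout are hypothetical and far simpler than the constructions in the paper.

```python
def server_loop(memory, requests):
    """Benign-looking dispatcher; every branch body is legitimate program code.

    Each request is (op, dst, src), standing in for fields the attacker
    controls through the vulnerability. The loop and branches never change.
    """
    for op, dst, src in requests:
        if op == "add":      # data gadget: addition
            memory[dst] += memory[src]
        elif op == "load":   # data gadget: dereference memory[src] as a pointer
            memory[dst] = memory[memory[src]]
        elif op == "store":  # data gadget: write through memory[dst] as a pointer
            memory[memory[dst]] = memory[src]
    return memory

# Compute memory[2] = memory[0] + memory[1] using only existing gadgets.
result = server_loop([5, 7, 0, 1], [("add", 2, 0), ("add", 2, 1)])
print(result[2])
```

Every edge taken here is in the CFG, so a checker that only validates control flow has nothing to flag.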
![Page 39: [CB16] COFI break – Breaking exploits with Processor trace and Practical control flow integrity by Ron Shina & Shlomi Oberman](https://reader036.vdocuments.net/reader036/viewer/2022070515/587756e61a28ab84388b77d5/html5/thumbnails/39.jpg)
any questions?