cfimon: detecting violation of control flow integrity using performance counters yubin xia, yutao...
TRANSCRIPT
CFIMon: Detecting Violation of Control Flow Integrity
using Performance Counters
Yubin Xia, Yutao Liu, Haibo Chen, Binyu Zangin DSN 2012
A.C. Chen 2012/09/18 @ ADL
A.C. Chen 2012/09/18 @ ADL
Outline
• Introduction• Performance Monitoring Units (PMU)• CFI Enforcement by CFIMon• Implementation• Experiment• Performance• Conclusion
2
A.C. Chen 2012/09/18 @ ADL 3
INTRODUCTION
A.C. Chen 2012/09/18 @ ADL 4
Motivation
• Many classes of security exploits usually involve introducing abnormal control flow transfers– Code-injection attack– Code-Reuse Attacks
• return-into-libc (RILC)• return-oriented programming (ROP) • jump-oriented programming (JOP)
• Countermeasures– non-executable stacks– Stack-Guard– safe C library– heuristic means– ….– usually designed for a specific problem
A.C. Chen 2012/09/18 @ ADL
Some General Solutions…?
• Control flow integrity (CFI) [Abadi et al.]– statically rewrites a program + dynamic inlined guards
• Suffer from coverage problems
• Control flow locking [Tyler Bletsch et al.]– recompiles a program
• difficult to be applied to legacy applications
• Architectural support to validate or enforce control flow integrity [Shi et al.]– need to re-design existing processors
5
A.C. Chen 2012/09/18 @ ADL 6
In this Paper…
• Detect a set of attacks that cause abnormal control flow transfers --- CFIMon– without changes to existing hardware, source code or
binaries– leverage the hardware support for performance counters
to monitor the control flow integrity (CFI)
A.C. Chen 2012/09/18 @ ADL 7
PERFORMANCE MONITORING UNITS (PMU)Hardware support for performance monitoring
A.C. Chen 2012/09/18 @ ADL 8
Performance Monitoring Units (PMU)
• perfmon
A.C. Chen 2012/09/18 @ ADL 9
2 Working Modes of PMU
• Interrupt-based mode (basic mode)– lacks precise instruction pointer information
• the reported IP may be up to tens of instructions away from the actual IP (instruction pointer) causing the event
• Precision mode– improve the precision and flexibility of PMUs– e.g. techniques used in Intel CPU:
• PEBS: Precise Event-Based Sampling• BTS: Branch Trace Store• LBR: Last Branch Record• Event Filtering• Conditional Counting
A.C. Chen 2012/09/18 @ ADL 10
Precision Mode of Intel CPU---Branch Trace Store (BTS) Mechanism
• Record all control transfer precisely into a predefined buffer– jump, call, return, interrupt and exception– also record the addresses of branch source and target
• Let a monitor get the trace in a batch – an interrupt will be delivered when the buffer is nearly
full
• Obtain all the branch information of a running application, help users locate the vulnerabilities
A.C. Chen 2012/09/18 @ ADL 11
CFI ENFORCEMENT BY CFIMON
Offline Analysis and Online Detection
A.C. Chen 2012/09/18 @ ADL 12
Main Idea
• The CFI of an application can be maintained if we can – get a legal set of branch target addresses for every
branch– check whether the target address of every branch is
within the corresponding legal set at runtime
A.C. Chen 2012/09/18 @ ADL 13
Branch Classification in X86 ISA---Direct Branch & Its Target Address
• Direct Branch – Direct jump
• jnz c2ef0 <__write>– Direct call
• callq 34df0 <abort>
• Since the code is read-only and cannot be modified during runtime, both the direct jump and direct call are considered safe one
(safe branch) √
A.C. Chen 2012/09/18 @ ADL 14
Branch Classification in X86 ISA---Indirect Branch & Its Target Address
• Indirect Branch– Indirect jump
• jmpq *%rdx• not possible to gain the whole target address set just
by static analysis– Indirect call
• callq *%rax• its target address could be obtained by statically
scanning the binary code of the application and the libraries it uses
– Return• retq• its target address could also be obtained by scanning
the binary code.
(unsafe branch) √
A call can only transfer control to thestart of a function.
In general, the target address of a return has to be the one next to a call
Dynamic Training
A.C. Chen 2012/09/18 @ ADL 15
CFIMon: 2 Phases
• Offline phase– build a legal set of target addresses for each branch
instruction
• Online phase– diagnose possible attacks with legal sets following a
number of rules• determine the status of the branch as legal, illegal
or suspicious
A.C. Chen 2012/09/18 @ ADL 16
Offline Analysis--- obtain legal set: ret_set, call_set
• Scans the binary of application and dynamic libraries to get– ret_set
• contains all addresses of the instructions next to each call
• special cases– call_set
• contains all addresses of the first instruction of each function
.
.add(3,4);printf(“TEST!”);.. ret_set
int add(int a, int b){
printf(“1st inst.”);..} call_set
A.C. Chen 2012/09/18 @ ADL 17
Offline Analysis--- obtain legal set:train_set
• Use training to collect branches trace ( recorded by BTS ) for each indirect jump, get the legal set of– train_set– there could be corner cases which are not covered
• considered as suspicious during online checking
A.C. Chen 2012/09/18 @ ADL 18
Online Detection
<source,target>
special case?
<source> is directbranch?
legalillegalsuspicious
ret_set call_settrain_se
t
yes
no
yes
noyes
yes
noyes
no
no
<source> is indirect
call
<source> is
return
<source> is indirect jump
Consider the state of a branch depending on <target>
switch intodifferent cases based on <source>
slide-windowmechanism
A.C. Chen 2012/09/18 @ ADL 19
Slide-Window Mechanism ---For Suspicious Branches
• The diagnose module makes a flexible decision depending on the pattern of the branches– maintain a window of the states of recent n branches– apply a rule of tolerating at most m suspicious branches
in the recent n ones• i.e., at most m suspicious branches are accepted in
recent n branches
A.C. Chen 2012/09/18 @ ADL 20
IMPLEMENTATION
A.C. Chen 2012/09/18 @ ADL 21
Implementation
• Debian-6 with kernel version 2.6.34 – 2GB 1066MHz main memory– Intel Core i5 processor with 4 cores
• Based on perf_events to implement the CFIMon– a unified kernel extension in Linux for user-level
performance monitoring
A.C. Chen 2012/09/18 @ ADL 22
CFIMon---Mainly 2 Components
• A kernel extension – operate the performance samples– monitor signals – provide the interfaces to user-level tool
• A user-level tool with 2 modules– diagnose module
• check the control flow integrity• receives information from the OS to solve special
cases such as signal handling– control module
• initialize the environment • launch and synchronize with an application
A.C. Chen 2012/09/18 @ ADL 23
A kernel extension
Architecture
A user-level tool with 2 modules
A.C. Chen 2012/09/18 @ ADL 24
CFIMon---Monitoring
• The user-level tool is the parent process of the application process, executed as a monitoring process– use ptrace to synchronize with the application process– run for security check at the critical point
• e.g. when the child process makes the exec system call
A.C. Chen 2012/09/18 @ ADL 25
EVALUATION
Evaluate the detection ability of CFIMon
A.C. Chen 2012/09/18 @ ADL 26
Experimental Samples
• Use several real-world applications as well as 2 demo programs to detect– Code-Injection Attacks– Return-to-libc Attacks– Return-oriented Programming (Samba, GPSd, and Wu-
ftpd-2.6.0 excluded)
A.C. Chen 2012/09/18 @ ADL 27
Evaluation for Code-Injection Attacks
• Use the metasploit framework to generate nop-sled before the injected code– attack each application with injected code 5 times to test
the false negatives– CFIMon detects all these attacks as expected
• report a security alarm
• For example, code-injection attack of Samba– heap overflow function lsa_trans_name and overwrite
the function pointer destructor– CFIMon detected such attack since the branches have
never appeared in the train_set
post-attack diagnosis
A.C. Chen 2012/09/18 @ ADL 28
Evaluation for Return-to-libc Attacks
• CFIMon successfully detects all these attacks without experiencing false negatives
• Return-to-libc Attack of GPSd (ver. 2.7)– format string vulnerability in function gpsd_report– allows remote attackers to execute arbitrary libc function
(e.g. system ) via certain GPS requests (via tcp port 2947 )
– CFIMon marks it and the following branches as suspicious since the branches have never appeared in the train_set
– an alarm is triggered since the number of suspicious branches quickly exceeds the threshold
addr. of systemaddr. of …
.
.
suspicious
branches window size = 20tolerant at most 3 suspicious branches
A.C. Chen 2012/09/18 @ ADL 29
Evaluation for Return-oriented Programming Attacks
• Similar to other evaluation, CFIMon successfully detects all these attacks without experiencing false negatives
• Return-oriented Programming Attack of Squid (ver. 2.5-STABLE1)– stack overflow bug in its helper module, ntlm, when
authentication – smash the stack by supply arbitrary password of at most
300 bytes in function ntlm_check_auth – violates the rules of CFIMon which enforces that the
target address of a return instruction must be the one next to a call
A.C. Chen 2012/09/18 @ ADL 30
PERFORMANCE
Overhead evaluation
A.C. Chen 2012/09/18 @ ADL 31
Performance Evaluation
• Quantitatively evaluate the performance of CFIMon using several real-world applications– Apache– Exim– Memcached– Wu-ftpd
A.C. Chen 2012/09/18 @ ADL 32
Overhead Results
• Memory overhead is negligible– since the size of the tables ( ret_set, call_set and
train_set) is quite small
• Performance overhead
Average overhead of CFIMon is only 6.1%
Average overhead of pure BTS is 5.2%
A.C. Chen 2012/09/18 @ ADL 33
CONCLUSION
A.C. Chen 2012/09/18 @ ADL 34
Conclusion
• The proposed CFIMon leveraged the branch trace store (BTS) mechanism to detect violation of control flow integrity
• The performance result shows that CFIMon can be applied to some real-world server applications on off-the-shell systems in daily use
A.C. Chen 2012/09/18 @ ADL 35
Q & A
A.C. Chen 2012/09/18 @ ADL 36
Return-Without-Call
• There are several cases that the calling convention may be violated:– setjmp/longjmp
• Instead of returning to its own caller, the longjmp returns to the caller of setjmp (also a legal address)
– Unix signal handling• Instead of returning to the caller (OS), the handler
returns to the interrupted process • modify the OS to let the monitor omit the alarm when
a signal handler returns
second main
A.C. Chen 2012/09/18 @ ADL 37
Calling Convention
Stack Frameof A()
Stack Frame of B()
Stack Frame of C()
Stack Frame of D()
High addr.
Low addr.
A()
B()
C()
D()
A.C. Chen 2012/09/18 @ ADL 38
setjmp/longjmp
second main
A.C. Chen 2012/09/18 @ ADL 39
Precision Mode of Intel CPU---PEBS, BTS
• PEBS (Precise Event-Based Sampling)– Precise Performance Counter– atomic‐freeze: record exact IP address precisely
• BTS (Branch Trace Store)– to capture all control transfer events
• jump, call, return, interrupt and exception– also record the addresses of branch source and target– enables the monitoring of the whole control flow of an
application
A.C. Chen 2012/09/18 @ ADL 40
Precision Mode of Intel CPU---LBR, Event Filtering, Conditional Counting
• LBR (Last Branch Record)– to record the most recent branches into a register stack– the size of the register stack is small
• Event Filtering– to filter events not concerned with– currently only available in LBR not BTS
• Conditional Counting– to separate user-level events from kernel-level ones– only increment counter while the processor is running at
a specific privilege level• e.g. “only counting when at user mode”