D2Taint: Differentiated and Dynamic
Information Flow Tracking on Smartphones for Numerous Data
Sources
Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion, Zhezhe Chen, Feng Qin,
and Dong Xuan
The Ohio State University
Infocom 2013
Outline
Introduction
Background
Differentiated and Dynamic Tagging
IFT with Dynamic and Differentiated Tagging
Evaluation Method & Experimental Results
Conclusions
Introduction
Trend Micro reports that over 25,000 Android malware samples were found in June 2012 alone
46%-55% of smartphone apps transmit users’ private information over networks without users’ awareness or consent
Introduction
TaintDroid
extend Dalvik VM to tag smartphone data using 32 possible types based on their origins
Just 32 origins ...
Introduction
D2Taint
track sensitive data from a large number of possible internal and external sources
partition sources into disjoint classes (correspond to different sensitivities)
tag structure updates itself on-the-fly
Background
Information Flow Tracking Basics
Compiler analysis on programs written in special type-safe programming languages
Software instrumentation at source code, bytecode, or binary level
Architecture support for IFT
Background
Pre-defined structure
source IDs or sensitivity levels
Tag propagation policy
a = b + c --> a.tag = max(b.tag, c.tag)
Tag checking
“Passwords” may not be sent over network
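A minimal sketch of the propagation and checking rules above, assuming tags are plain integers where a larger value means higher sensitivity. The `Tainted` wrapper, `PASSWORD_LEVEL`, and `may_send_over_network` are illustrative names, not part of any IFT system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value paired with a numeric sensitivity tag (0 means untainted)."""
    value: int
    tag: int = 0

    def __add__(self, other: "Tainted") -> "Tainted":
        # Propagation policy from the slide: a.tag = max(b.tag, c.tag)
        return Tainted(self.value + other.value, max(self.tag, other.tag))


PASSWORD_LEVEL = 3  # hypothetical sensitivity level assigned to passwords


def may_send_over_network(data: Tainted) -> bool:
    # Checking policy: data at password level (or above) may not be sent.
    return data.tag < PASSWORD_LEVEL


b = Tainted(10, tag=1)
c = Tainted(32, tag=PASSWORD_LEVEL)
a = b + c  # a inherits the higher of the two tags
```

Here `a` carries the password-level tag even though only one operand was sensitive, so the check at the network sink rejects it.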
TaintDroid
Dalvik opcode: http://pallergabor.uw.hu/androidblog/dalvik_opcodes.html
Differentiated and Dynamic Tagging
Source level
different info sources may have different sensitivities in terms of security
Differentiated and Dynamic Tagging
Application level
different amounts of storage space to capture heterogeneous sources and correlations
Differentiated and Dynamic Tagging
User level
adapt to changing information source access patterns
Differentiated and Dynamic Tagging
By examining the source level and application level
Differentiated classes
By examining the user level
Tag dynamics
Differentiated and Dynamic Tagging
Tag structure
Tag Structure
Tag: scheme ID | Class 1 | Class 2 | Class 3
Class 1 Table: 0001 google.com; 0010 yahoo.com
Tag scheme: 00 -> 2/3/1; 01 -> 2/2/2; 10 -> 4/1/1; 11 -> 3/2/1
Tag Structures Examples
32 bits, 1 class for each bit: TaintDroid
32 bits, 2-bit tag scheme ID, 3 classes, 16/8/6 bits per class, 4/4/2 bits per source
32 bits, 2-bit tag scheme ID, 2 classes, 24/6 bits per class, 3/2 bits per source
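The 16/8/6 example can be sketched as bit packing. This assumes the scheme ID occupies the top two bits, followed by the class fields in order, and that each class field is divided into fixed-width source codes with 0 meaning an empty slot. The slides do not state the bit order or the empty-slot encoding, so both are assumptions.

```python
SCHEME_ID_BITS = 2
CLASS_BITS = (16, 8, 6)  # class widths from the example; 2 + 16 + 8 + 6 = 32


def pack_tag(scheme_id, class_fields):
    """Pack a scheme ID and one field per class into a 32-bit tag."""
    if not 0 <= scheme_id < (1 << SCHEME_ID_BITS):
        raise ValueError("scheme ID out of range")
    tag = scheme_id
    for width, value in zip(CLASS_BITS, class_fields):
        if not 0 <= value < (1 << width):
            raise ValueError("class field out of range")
        tag = (tag << width) | value
    return tag


def unpack_tag(tag):
    """Split a 32-bit tag back into (scheme_id, class_fields)."""
    fields = []
    for width in reversed(CLASS_BITS):
        fields.append(tag & ((1 << width) - 1))
        tag >>= width
    return tag, tuple(reversed(fields))


def source_codes(class_field, class_width, bits_per_source):
    """List the non-empty source codes in a class field (0 assumed empty)."""
    mask = (1 << bits_per_source) - 1
    codes = [(class_field >> shift) & mask
             for shift in range(0, class_width, bits_per_source)]
    return [code for code in codes if code != 0]


# Class 1 holds codes 0001 and 0010 (google.com and yahoo.com in the table).
tag = pack_tag(1, (0x0012, 0x03, 0x01))
```

With 4 bits per source, the 16-bit class 1 field has room for four source codes, so a single tag can record several origins in the same class.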
Tag Dynamics
Each class can have different length at different times
Perform “on-demand” machine learning based on statistical properties of tag space usage and location information tables’ recent hash values
Adjust its tag structure
Tag Dynamics Issues
Tag Scheme Switching
tag scheme config -> preconfigured
when to switch tag scheme -> on-the-fly
Tag merging
IFT with Dynamic and Differentiated Tagging
When an app starts, the dynamic tagging component first loads two configuration files
tag structure definitions
user-defined classes and known data sources
IFT with Dynamic and Differentiated Tagging
For each incoming data source reported by the tag assigner, the dynamic tagging component checks the data source list,
tracks incoming sources’ statistics, and determines whether it should switch D2Taint to a different tag scheme
Dynamic Tagging Component
Dynamic Tagging Core
tag scheme settings: scheme number, bits per tag, number of classes, and a pointer to the class list
class structure: number of classes in the tag system, bits per hashcode, number of reserved slots for the class, and a text description of the class
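The settings listed above could be modeled with two small records. This is a sketch only: the field names follow the slide, while the 2-bit scheme ID width, the consistency check, and the class descriptions are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TagClass:
    bits_per_hashcode: int   # width of one source code in this class
    reserved_slots: int      # number of source codes the class can hold
    description: str         # free-text description of the class

    @property
    def width(self) -> int:
        return self.bits_per_hashcode * self.reserved_slots


@dataclass
class TagScheme:
    scheme_number: int
    bits_per_tag: int
    classes: List[TagClass]  # stands in for the "pointer to the class list"
    scheme_id_bits: int = 2  # assumed, matching the 2-bit scheme ID examples

    @property
    def num_classes(self) -> int:
        return len(self.classes)

    def is_consistent(self) -> bool:
        # The scheme ID plus all class fields must fill the tag exactly.
        used = self.scheme_id_bits + sum(c.width for c in self.classes)
        return used == self.bits_per_tag


# The 16/8/6-bit example with 4/4/2 bits per source; descriptions are made up.
example = TagScheme(
    scheme_number=0,
    bits_per_tag=32,
    classes=[
        TagClass(4, 4, "high sensitivity"),
        TagClass(4, 2, "medium sensitivity"),
        TagClass(2, 3, "low sensitivity"),
    ],
)
```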
Dynamic Tagging Component
Dynamic Tagging Core
use a global location information table list to record all source information
after a certain number of new sources are added to a location information table, D2Taint decides whether to switch the tag scheme based on these new sources
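The "decide after a certain number of new sources" step might look like the following. The selection rule here is a simple stand-in (pick the preconfigured scheme that leaves the fewest recent sources without a slot), not the learning-based method the slides mention. The scheme table, the interval of 5, and all names are hypothetical.

```python
from collections import Counter

# Hypothetical preconfigured schemes: scheme ID -> source slots per class.
SCHEMES = {0: (4, 2, 3), 1: (2, 4, 3), 2: (6, 1, 1)}
SWITCH_CHECK_INTERVAL = 5  # assumed: re-evaluate after this many new sources


class SchemeSelector:
    def __init__(self, current=0):
        self.current = current
        self.recent = Counter()  # class index -> new sources since last check
        self.pending = 0

    def record_new_source(self, class_index):
        """Register a newly seen source; return the scheme ID now in effect."""
        self.recent[class_index] += 1
        self.pending += 1
        if self.pending >= SWITCH_CHECK_INTERVAL:
            self.current = self._best_scheme()
            self.recent.clear()
            self.pending = 0
        return self.current

    def _best_scheme(self):
        def uncovered(slots):
            # Recent sources that would not fit into this scheme's slots.
            return sum(max(0, self.recent[i] - n) for i, n in enumerate(slots))

        # On ties, prefer the current scheme to avoid needless switching.
        return min(SCHEMES,
                   key=lambda s: (uncovered(SCHEMES[s]), s != self.current))


selector = SchemeSelector()
history = [selector.record_new_source(c) for c in (1, 1, 1, 1, 0)]
```

In this run four of the five new sources fall into the second class, so the selector moves from scheme 0 to scheme 1, which reserves more slots for that class.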
Dynamic Tagging Component
Tag Merger
a.tag = b.tag ⊕ c.tag
if using the same tag scheme?
if using different tag schemes?
truncate certain significant bits
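A sketch of the two merge cases for a single class field, assuming a tag's class field is a list of source codes. When the two tags use different code widths, this sketch narrows both to the smaller width by keeping the low-order bits. The slides do not say which bits are dropped or what happens when slots overflow, so both choices are assumptions.

```python
def truncate_code(code, to_bits):
    """Keep the low-order to_bits bits of a source code (assumed policy)."""
    return code & ((1 << to_bits) - 1)


def merge_codes(b_codes, c_codes, slots):
    """Union two code lists in order, keeping at most `slots` entries."""
    merged = []
    for code in list(b_codes) + list(c_codes):
        if code not in merged:
            merged.append(code)
    return merged[:slots]  # overflow is silently dropped in this sketch


def merge_class_field(b_codes, b_bits, c_codes, c_bits, slots):
    """Merge one class field of b.tag and c.tag; return (codes, code width)."""
    bits = min(b_bits, c_bits)
    b_norm = [truncate_code(code, bits) for code in b_codes]
    c_norm = [truncate_code(code, bits) for code in c_codes]
    return merge_codes(b_norm, c_norm, slots), bits


# Same scheme: both tags use 4-bit codes, so the merge is a plain union.
same = merge_class_field([0b0001, 0b0010], 4, [0b0010, 0b0100], 4, slots=4)

# Different schemes: the 4-bit code 1010 is narrowed to 3 bits (010).
mixed = merge_class_field([0b1010], 4, [0b010, 0b111], 3, slots=4)
```

In the second case the narrowed code 010 coincides with a code already present in the other tag, which shows how truncation can trade precision for space.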
Information Flow Tracking Component
Taint Map
tags of method local variables, method arguments, and class instance fields are not stored adjacent to the values themselves in memory
Taint Map
Method local variables and method arguments
when Dalvik VM allocates a stack frame for a method, our system allocates a stack taint map for it
Class instance fields
to be stored in objects’ taint maps
an object’s taint map is stored immediately after the memory allocated for the object
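The two taint-map layouts can be modeled roughly as follows. Python lists stand in for raw memory, and the class names and the one-tag-per-slot granularity are assumptions for illustration.

```python
class StackFrame:
    """A method frame with a separate taint map allocated alongside it."""

    def __init__(self, num_registers):
        self.registers = [0] * num_registers
        # Tags live in their own map rather than interleaved with values.
        self.taint_map = [0] * num_registers

    def store(self, index, value, tag=0):
        self.registers[index] = value
        self.taint_map[index] = tag

    def tag_of(self, index):
        return self.taint_map[index]


class HeapObject:
    """An object whose taint map sits right after its field storage."""

    def __init__(self, num_fields):
        self.num_fields = num_fields
        # One flat buffer: fields first, then one tag per field.
        self.memory = [0] * (2 * num_fields)

    def set_field(self, index, value, tag=0):
        self.memory[index] = value
        self.memory[self.num_fields + index] = tag

    def field_tag(self, index):
        return self.memory[self.num_fields + index]


frame = StackFrame(3)
frame.store(1, 1234, tag=0b0001)

obj = HeapObject(2)
obj.set_field(0, 99, tag=0b0010)
```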
Information Flow Tracking Component
Tag Assigner
insert our tag assigner logic into file I/O, network I/O, sensor, and other library functions that read private information
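One way to picture the tag assigner is as a wrapper around any function that reads private data. In this sketch a decorator attaches a source code to the return value. The decorator, the registry dictionary, and the sample reader functions are all hypothetical and do not correspond to Android APIs.

```python
import functools


def tag_source(source_name, registry):
    """Wrap a data-reading function so its result carries a source code."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            value = fn(*args, **kwargs)
            # Assign the next free code the first time a source is seen.
            code = registry.setdefault(source_name, len(registry) + 1)
            return value, code
        return wrapper
    return decorate


source_registry = {}  # source name -> code; stands in for a location table


@tag_source("device_id", source_registry)
def read_device_id():
    return "000000000000000"  # placeholder value, not a real identifier


@tag_source("contacts", source_registry)
def read_contacts():
    return ["alice", "bob"]


device_id, device_tag = read_device_id()
contacts, contacts_tag = read_contacts()
```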
Information Flow Tracking Component
Tag Propagator
interpreted code and native code: same as TaintDroid
also propagate tags via Binder IPC
Information Flow Tracking Component
Tag Checker
trustable sites
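A sketch of a check at a network sink, assuming the policy is "tagged data may go only to trusted hosts". The trusted set, the report format, and the function name are illustrative. The slide itself only names trustable sites as an input to the checker.

```python
from urllib.parse import urlparse

TRUSTED_HOSTS = {"google.com"}  # hypothetical list of trustable sites


def check_send(url, tag_sources):
    """Return one report line per tagged source sent to an untrusted host."""
    host = urlparse(url).hostname or ""
    if not tag_sources or host in TRUSTED_HOSTS:
        return []  # untagged data, or a trusted destination
    return [f"{source} -> {host}" for source in sorted(tag_sources)]


ok = check_send("https://google.com/upload", {"imei"})
leaks = check_send("http://ads.example.net/t", {"imei", "contacts"})
```

Because each report names both the origin and the destination, this style of check also records the path by which data leaves the device.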
Evaluation Method & Experimental Results
Android 2.2 on Nexus One
Select 84 “top free” apps from Google Play
CaffeineMark for benchmark
Evaluation Method & Experimental Results
Real-world
71 out of 84 apps leak information
reveal the paths by which the information is leaked
33 apps transmit data among various external sources
12 apps leak devices’ IMEIs/EIDs
Evaluation Method & Experimental Results
Performance
[Figure: performance chart with annotated values 9%, 7.3%, 16%, 3%, 13%, and 21%]
Evaluation Method & Experimental Results
Java Macrobenchmark
Evaluation Method & Experimental Results
CaffeineMark’s memory footprint
Android: 21664 KB
D2Taint: 22528 KB
this test ignored the memory used by the location information table, which grows dynamically as more information sources arrive
Evaluation Method & Experimental Results
Sequential websites
[Figure: dynamic vs. static tagging]
Evaluation Method & Experimental Results
Random websites
[Figure: dynamic vs. static tagging]
Conclusions
A novel IFT tagging strategy
using differentiated and dynamic tagging
dynamic tag structure