i know where you've been: geo-‐inference a*acks via the browser
TRANSCRIPT
I Know Where You’ve Been: Geo-‐Inference A6acks via the
Browser Cache
Yaoqi JIA
Department of Computer Science Na2onal University of Singapore
Do You Care About your Geo-‐locaCon?
Video:How to Infer Your Geo-‐locaCon without Your Consent
Our Agenda
Ø Background of geo-‐loca2ons in browsers, browser cache, and 2ming channels
Ø Geo-‐inference aBacks via the browser cache
Ø Prevalence of geo-‐inference aBacks
Ø Pros & cons of poten2al solu2ons
Ø Demo Video for aBacks in TorBrowser
Ø Q & A
4!
Geo-‐locaCon in Browsers
5!
Geo-‐locaCon in Browsers
6!
Geo-‐locaCon in Browsers: Benefits & Threats
Benefits Threats
7!
May I Access Your Geo-‐locaCon?
8!
Sources of Users’ Geo-‐locaCons
Browser
Not reliable
9!
Problem Statement
Browser
?
Can the aBacker infer the user’s geo-‐loca2on from his browser?
10!
Background: Browser Cache
11!
Browser
Web Application
Parser④
③
① ② Network Module
Cache
DirecCves in Response Headers to Control Cache
Ø StaCc resources: Ø Expires, Cache-‐Control: max-‐age, Last-‐Modified
Ø Dynamic and sensiCve resources: Ø Cache-‐Control: no-‐cache, no store; Pragma:
no-‐cache; Expires: 0
12!
Browser Cache Stores StaCc Resources
Browser!
Browser stores site-‐related states
13!
Benefits of Browser Cache
Browser Cache!
1st: 1360ms 2nd: 320ms 3rd: 350ms
Save Time!
14!
Timing Channels via the Browser Cache
Browser Cache!
1st: 1360ms 2nd: 320ms 3rd: 350ms
15!
Geo-‐Inference A6acks via the Browser Cache
Browser Cache!
Infer users’ geo-‐loca2ons!
Browser cache is shared
across all sites
16!
Our A6acks: Infer a User’s Geo-‐locaCon without the Manual Input, Accessing GPS
Sensors or IP Addresses
17!
What are the Techniques to Determine the Cache Status of
Targeted Resources?
18!
A6ack Vector (I) : Measuring Image Load Time
var image = document.createElement(`img');
image.setAttribute(`startTime', (new Date().getTime()));
image.onload = function()
{
var endTime = new Date().getTime();
var loadTime = endTime - parseInt(this.getAttribute(`startTime'));
......
}
Before Loading img.onload Fires
19!
aBacker.com
A6ack Vector (II) : Measuring Page Load Time
var page = document.createElement(`iframe');
page.setAttribute(`startTime', (new Date()).getTime());
page.onload = function ()
{
var endTime = (new Date()).getTime();
var loadTime = ( endTime - parseInt(this.getAttribute(`startTime')));
......
}
Before Loading iframe.onload Fires
20!
aBacker.com
A6ack Vector (III) :Measure the Load Time of XMLH6pRequests
var starTime, endTime, loadTime;
var xmlhttp = new XMLHttpRequest();
xmlhttp.onloadstart = function(){
startTime = (new Date()).getTime();
}
xmlhttp.onloadend = function(){
endTime = (new Date()).getTime();
loadTime = endTime - startTime;
......}
onloadstart Fires onloadend Fires
21!
aBacker.com
A6ack Vector (IV) : Use <img>’s complete Property
function cached(url)
{
var image = document.createElement(`img');
image.src = url;
return image.complete || image.width+image.height > 0;
} aBacker.com
22!
Examples: What Can We Achieve?
Ø User’s country?
Ø User’s city?
Ø User’s streets or neighborhood?
23!
How to Infer a User’s Country? (I)
• Google has 191 regional sites. • One site represents one
country or region.
24!
google.com.sg/images/srpr/logo11w.png
How to Infer a User’s Country? (II)
Browser Cache!
25!
Cached!
How to Infer a User’s City? (I)
• Craigslist provides local classifieds adver2sements and forums for jobs, housing, etc.
• Craigslist has 712 city-‐specific sites. • Users buy or sell second-‐hand stuff in their Craigslist’s city-‐specific sites.
26!
How to Infer a User’s City? (II)
Browser Cache!
27!
chicago.craigslist.org
s^ay.craigslist.org
newyork.craigslist.org
tokyo.craigslist.jp
singapore.craigslist. com.sg
Cached!
How to Infer a User’s Neighborhood?(I) Predictable URLs
28!
Map Tiles
How to Infer a User’s Neighborhood? (II)
Browser Cache!
29!
Cached!
EvaluaCon
Ques2ons to be answered:
Ø (Prevalence) How many websites and browsers can be u2lized to conduct aBacks?
Ø (Reliability) How big is the 2me difference between the loading 2me of resources without cache and that with cache?
30!
EvaluaCon Setup
Ø Websites: 191 Google’s sites, 100 Craigslist’s sites, and 55 top Alexa sites.
Ø Maps: Google Maps, and other 10 map service sites.
Ø Browsers: Five mainstream browsers and TorBrowser
Ø Loca2ons: US, UK, Australia, Singapore, and Japan.
31!
How Many Websites and Browsers can be UClized to Conduct A6acks?
32!
Alexa Top Websites with LocaCon-‐Related Resources
62% of 55 top Alexa global sites
33!
singapore.craigslist. com.sg
sg.yahoo.com
www.ebay.com.sg
Map Websites with LocaCon-‐Related Resources
All of 11 map service sites
34!
SuscepCble Browsers & Plaaorms
Mainstream Browsers
Desktop Plakorms Mobile Plakorms
35!
Par2al
How Significant is the Time Difference between the Loading Time of Resources without Cache and that with Cache?
36!
Loading Time: Without Cache v.s. With Cache I
Difference in image load /me (in millisecond): Without Cache (> 129 ms) v.s. With Cache (0 ∼ 1 ms), for 191 Google’s regional domains in Chrome on Mac OS X
37!
0
200
400
600
800
1000
1200
1 7 13
19
25
31
37
43
49
55
61
67
73
79
85
91
97
103
109
115
121
127
133
139
145
151
157
163
169
175
181
187
Without Cache
With Cache
120ms
Loading Time: Without Cache v.s. With Cache II
The significant difference between the page load 2me (in millisecond) of 100 Craigslist sites without cache (> 1000 ms) and with cache (≈ 220 ms) indicates geo-‐inference aBacks with Craigslist
0
500
1000
1500
2000
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
Without Cache With Cache
38!
700ms
Loading Time: Without Cache v.s. With Cache III
Difference in page load /me (in millisecond): Without Cache (> 50 ms) v.s. With Cache (0 ∼ 1 ms), for 4,646 map 2les of New York City from Google Maps in Chrome on Mac OS X. 39!
0
50
100
150
200
250
1 127
253
379
505
631
757
883
1009
1135
1261
1387
1513
1639
1765
1891
2017
2143
2269
2395
2521
2647
2773
2899
3025
3151
3277
3403
3529
3655
3781
3907
4033
4159
4285
4411
4537
Without Cache
With Cache
50ms
Loading Time (Android)
The page load 2me of 100 Craigslist sites on Android.
0
500
1000
1500
2000
2500
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 73 76 79 82 85 88 91 94 97 100
Without Cache With Cache
40!
700ms
How to Protect Users from Geo-‐inference A6acks
Discussion of Defense SoluCons
Ø Private Browsing Mode
Ø Randomizing 2ming measurements
Ø TorBrowser and Segrega2ng browser cache
42!
Private Browsing Mode
Private Browsing Mode is not the Cure
Ø Clear browser cache aser closing the window.
Ø Disable disk cache, not the in-‐ memory cache.
Ø It cannot prevent one site from inferring the user’s geo-‐loca2on from other sites.
43!
Browser Cache!
Randomizing Timing Measurements
Ø Add noise into 2ming measurement mechanisms.
Ø Affect web applica2ons’ func2onali2es
Ø Intricate engineering effort.
44!
Browser Cache!
TorBrowser is not Perfect
Ø Adds an addi2onal “id=string” property to label every cache entry with the top-‐level window’s domain.
Ø Insufficient for mashup websites, all the embedded sites in frames share the same top-‐level window’s domain, i.e., the mashup’s domain.
45!
Browser Cache!
Demo Video
46!
Video: Geo-‐inference A6acks in TorBrowser
47!
SegregaCng Browser Cache
Ø Deploy Same-‐Origin Policy on browser cache.
Ø We experimented in Chromium 34
Ø High performance overhead for Alexa Top 100 websites
48!
Browser Cache!
0%
100%
200%
300%
400%
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 95 97
To Cache or Not To Cache?
Ø No cache for loca2on-‐sensi2ve resources (0.7% to 20.7% overhead).
Ø Cache-‐Control: no-‐cache for HTTP response header Ø Pre-‐fetch redundant loca2on-‐sensi2ve resources. Ø Open challenge to design an efficient and secure caching mechanism in browsers.
49!
Take-‐away
Ø Timing channels are s2ll open on mainstream browsers. Ø Knowing the power and prevalence of geo-‐inference aBack (inferring country, city, neighbourhood) and be cau2ous about it. Ø Disable cache? No JavaScript? Ø Never give addi2onal permissions to unfamiliar sites or open it for a long 2me. Ø Clear cache before and aser visi2ng a site with your private informa2on, e.g., online banking site.
50!
Yaoqi JIA E-‐mail: [email protected]
References
D. Akhawe, A. Barth, P. E. Lam, J. Mitchell, and D. Song, “Towards a formal founda2on of web security,” in Computer Security Founda/ons Symposium (CSF), 2010 23rd IEEE, 2010.
A. Bortz and D. Boneh, “Exposing private informa2on by 2ming web applica2ons,” in Proceedings of the 16th interna/onal conference on World Wide Web, 2007.
G. Wondracek, T. Holz, E. Kirda, and C. Kruegel, “A prac2cal aBack to de-‐anonymize social network users,” in Security and Privacy (SP), 2010 IEEE Symposium on, 2010.
Z. Weinberg, E. Y. Chen, P. R. Jayaraman, and C. Jackson, “I s2ll know what you visited last summer: Leaking browsing history via user interac2on and side channel aBacks,” in Security and Privacy (SP), 2011 IEEE Symposium on, 2011.
M. Jakobsson and S. Stamm, “Invasive browser sniffing and countermeasures,” in Proceedings of the 15th interna/onal conference on World Wide Web, 2006.
G. Aggarwal, E. Bursztein, C. Jackson, and D. Boneh, “An analysis of private browsing modes in modern browsers,” in Proceedings of the 19th USENIX Conference on Security, ser. USENIX Security’10, 2010.