Cloak of Visibility: Detecting When Machines Browse a Different Web - IEEE Security and Privacy 2016
![Page 1: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/1.jpg)
Cloak of Visibility: Detecting When Machines Browse a Different Web
Luca Invernizzi*, Kurt Thomas*, Alexandros Kapravelos†,
Oxana Comanescu*, Jean-Michel Picod*, and Elie Bursztein*
* Google - Anti-fraud and abuse research † North Carolina State University
![Page 2: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/2.jpg)
Web cloaking
Cloaking site
![Page 3: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/3.jpg)
Web cloaking
Search: effective for search engine optimization
Ads: effective to infringe policies
Malware: effective to evade security crawlers
![Page 4: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/4.jpg)
Responsive design vs cloaking
This is not cloaking.
![Page 5: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/5.jpg)
Responsive design vs cloaking
404
This is cloaking.
![Page 6: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/6.jpg)
Research goals
Keep up with arms race
Identify trends
Explore alternatives
![Page 7: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/7.jpg)
Blackmarket Investigation
Acquired top 10 cloaking software samples
Can’t go wrong with Cloaky McCloakyFace.
I swear by NowYouSeeMe!
![Page 8: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/8.jpg)
HTTP reverse proxy
$3500+ cloaking software
Decision based on: network, browser, browsing context
![Page 9: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/9.jpg)
$3500+ cloaking software
Admin interface
Configures
Generates
HTTP reverse proxy
![Page 10: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/10.jpg)
Admin interface
Input keywords => http://money.site
Features
● Find similar sites through SERPs
● Content/Template spinning
● Drip-feeding
Added services
● Plagiarism detection
● SERP ranking
![Page 11: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/11.jpg)
Cloaking techniques
![Page 12: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/12.jpg)
Technique: referer-based cloaking
GET /
Referer: ...tiffany+cheap...

GET /
Referer: blank

GET /
Referer: ...tiffany...
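The three requests above differ only in their Referer header, which suggests the proxy inspects the search query that led to the page. A minimal Python sketch of such a decision follows. The keyword set, the `q` query parameter, the page names, and the rule that every keyword must be present are assumptions for illustration; the slide does not say which request receives which page.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical configuration: keywords that must all appear in the search query.
REQUIRED_KEYWORDS = {"tiffany", "cheap"}


def query_terms(referer):
    """Extract lowercase search terms from the `q` parameter of a Referer URL."""
    if not referer:
        return set()
    params = parse_qs(urlsplit(referer).query)
    terms = set()
    for value in params.get("q", []):
        # parse_qs has already decoded "+" into spaces.
        terms.update(value.lower().split())
    return terms


def choose_page(referer):
    """Return which page a referer-based cloaker might serve (illustrative only)."""
    if REQUIRED_KEYWORDS <= query_terms(referer):
        return "storefront"
    return "benign"
```

Under these assumptions, only a visitor arriving from a search for both keywords would see the storefront; a blank Referer or a partial match would get the benign page.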
![Page 13: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/13.jpg)
Technique: IP blacklisting
Blacklisted IPs: 51m
Subnets: 983
Security companies: 30
Hacking collectives: 2
Proxy networks: 3
Entities (companies, universities, registrars): 122
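A blacklist made of individual IPs and subnets can be checked with a simple membership test. The sketch below uses Python's standard `ipaddress` module; the subnets are reserved documentation ranges standing in for the real lists, which are not reproduced on the slide.

```python
import ipaddress

# Hypothetical blacklist: documentation-only ranges, not real crawler subnets.
BLACKLISTED_SUBNETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_blacklisted(ip):
    """Return True if the visitor's IP falls inside any blacklisted subnet."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLACKLISTED_SUBNETS)
```

A linear scan is fine for a few hundred subnets; a list with millions of single IPs would more plausibly live in a set or a prefix tree.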
![Page 14: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/14.jpg)
Crowdsourced blacklist
Blacklisted IPs: 50k
Subscription: $350+
Honeypot
![Page 15: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/15.jpg)
Technique: rDNS cloaking
66.249.66.1
Host 66.249.66.1?
crawl.googlebot.com.
Google (.*1e100.*, .*google.*)
Microsoft, Yahoo, Yandex, Baidu, Ask, Rambler, DirectHit, Theoma
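The lookup above can be sketched as a reverse DNS query followed by a pattern match on the returned hostname. Only the two Google patterns come from the slide; the function names and the table layout are assumptions, and the other engines would need patterns of their own.

```python
import re
import socket

# Patterns for Google as shown on the slide; other crawlers are omitted here.
CRAWLER_PATTERNS = {
    "Google": [re.compile(r".*1e100.*"), re.compile(r".*google.*")],
}


def reverse_lookup(ip):
    """Return the PTR hostname for an IP, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def crawler_for_host(hostname):
    """Return the crawler name whose pattern matches the hostname, else None."""
    if hostname is None:
        return None
    for name, patterns in CRAWLER_PATTERNS.items():
        if any(p.match(hostname.lower()) for p in patterns):
            return name
    return None
```

A cloaker built this way would call `crawler_for_host(reverse_lookup(ip))` on each visitor and serve the benign page whenever the result is not `None`.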
![Page 16: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/16.jpg)
Technique: browsing pattern cloaking
GET /clicked
GET /
Set-Cookie: now()
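One reading of this slide is that the landing page (`GET /`) sets a cookie holding the server's current time, and the later `GET /clicked` request is trusted only if that cookie is present and old enough. The sketch below follows that reading; the threshold and function names are hypothetical, and the slide does not describe the exact check.

```python
import time

# Hypothetical threshold: a click sooner than this after landing looks automated.
MIN_SECONDS_BEFORE_CLICK = 1.0


def landing_cookie(now=None):
    """Cookie value for the first GET /: the server's current timestamp."""
    return str(time.time() if now is None else now)


def click_looks_human(cookie_value, now=None):
    """Decide whether GET /clicked follows a plausible earlier visit to /."""
    if cookie_value is None:
        return False  # the client never loaded the landing page
    try:
        first_seen = float(cookie_value)
    except ValueError:
        return False  # tampered or malformed cookie
    current = time.time() if now is None else now
    return current - first_seen >= MIN_SECONDS_BEFORE_CLICK
```

A crawler that requests `/clicked` directly, or that drops cookies between requests, would fail this check.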
![Page 17: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/17.jpg)
More techniques
Geolocation: country, city, carrier level.
Flash/JS support & fingerprints
User-Agent
![Page 18: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/18.jpg)
Prevalence and dominant techniques
404
Is this cloaking?
How do they cloak?
![Page 19: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/19.jpg)
Browser farm
Pretend Google bots: User-Agent: GoogleBot; Referer: blank; Google IP
Simple honey clients: User-Agent: Chrome; Referer: blank, or simple; Cloud provider IPs
Realistic honey clients: User-Agent: Chrome; Referer: context-aware; Residential and mobile IPs
wget wget
I’m real!
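The three crawler profiles of the browser farm can be written down as plain configuration. The sketch below is an assumption about how one might encode them: the `CrawlProfile` type, the field names, and the Google search URL used for the context-aware Referer are illustrative, and the User-Agent values are the short labels from the slide rather than real header strings.

```python
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass(frozen=True)
class CrawlProfile:
    name: str
    user_agent: str  # label from the slide, not a full User-Agent string
    referer: str     # "blank" or "context-aware"
    network: str     # where the request originates


PROFILES = [
    CrawlProfile("pretend-google-bot", "GoogleBot", "blank", "Google IP"),
    CrawlProfile("simple-honey-client", "Chrome", "blank", "cloud provider IP"),
    CrawlProfile("realistic-honey-client", "Chrome", "context-aware",
                 "residential or mobile IP"),
]


def request_headers(profile, search_query=""):
    """Build the HTTP headers a profile would send for one crawl."""
    headers = {"User-Agent": profile.user_agent}
    if profile.referer == "context-aware" and search_query:
        # Hypothetical: pretend the visit came from a search for the keywords.
        headers["Referer"] = (
            "https://www.google.com/search?q=" + quote_plus(search_query)
        )
    return headers
```

Fetching the same URL once per profile and comparing the responses is what lets differences between the bot view and the realistic view surface.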
![Page 20: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/20.jpg)
Features
Syntactic: content similarity (HTML), screenshot similarity (image)
Semantic: topic similarity (HTML), screenshot topic similarity (image)
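As an illustration of a syntactic content-similarity feature, the sketch below compares two page texts with Jaccard similarity over word shingles. This is a stand-in chosen for brevity, not necessarily the measure used in the talk; the shingle size and function names are assumptions.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def content_similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    set_a, set_b = shingles(text_a, k), shingles(text_b, k)
    if not set_a and not set_b:
        return 1.0  # two empty pages are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)
```

A low score between the page served to a bot profile and the page served to a realistic profile is one signal of cloaking; legitimate dynamic content can also lower the score, which is why the slide pairs syntactic features with semantic ones.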
![Page 21: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/21.jpg)
Classification
95k labeled samples: 75k legitimate websites (Alexa) + 20k cloaked storefronts
False positive rate: 0.9%
True positive rate: 82%
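For readers less familiar with the two rates, the sketch below shows how they are computed from a confusion matrix. The counts are hypothetical, picked so the arithmetic lands on the rates quoted above for a 20k/75k split; the slide does not give a confusion matrix or say how the evaluation set was partitioned.

```python
def rates(true_pos, false_neg, false_pos, true_neg):
    """Return (true positive rate, false positive rate) from raw counts."""
    tpr = true_pos / (true_pos + false_neg)   # share of cloaked pages caught
    fpr = false_pos / (false_pos + true_neg)  # share of legitimate pages flagged
    return tpr, fpr


# Hypothetical counts, not reported results.
tpr, fpr = rates(true_pos=16_400, false_neg=3_600, false_pos=675, true_neg=74_325)
```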
![Page 22: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/22.jpg)
Prevalence
Cloaking pages in Google Search, for luxury storefront keywords: 11.7%
Cloaking pages in Google AdWords, for health and software ads: 4.9%
![Page 23: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/23.jpg)
Traditional techniques: only IP, Referer, and User-Agent
Search: 1 out of 5
Ads: 1 out of 4
![Page 24: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/24.jpg)
Current techniques: JavaScript support
Search: Half
Ads: 1 out of 4
![Page 25: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/25.jpg)
Current techniques: wait for click
Search: 1 out of 10
Ads: 1 out of 5
![Page 26: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/26.jpg)
Delivery: same-page cloaking
Uncloaked Cloaked
Search: 1 out of 5
Ads: 2 out of 3
![Page 27: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/27.jpg)
Delivery: 40x/50x errors to bots
404
Search: 1 out of 7
Ads: 1 out of 8
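Error-based delivery can be spotted by comparing the HTTP status each crawler profile receives for the same URL. The sketch below is one possible check; the profile names and the exact rule (every bot profile gets a 4xx/5xx while at least one other profile gets a 2xx) are assumptions.

```python
def errors_only_for_bots(status_by_profile, bot_profiles):
    """Flag a URL that returns errors to bot profiles but content to others."""
    bot_codes = [s for name, s in status_by_profile.items() if name in bot_profiles]
    other_codes = [s for name, s in status_by_profile.items()
                   if name not in bot_profiles]
    if not bot_codes or not other_codes:
        return False  # nothing to compare
    bots_get_errors = all(400 <= s < 600 for s in bot_codes)
    others_get_content = any(200 <= s < 300 for s in other_codes)
    return bots_get_errors and others_get_content
```

A site that is simply down returns errors to every profile and is not flagged by this rule.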
![Page 28: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/28.jpg)
Future: client-side detection
Search/Ads links add a parameter with the topics found by the bot. The client checks that the page it loads matches the same topics.
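The proposal can be sketched as two steps: read the topics attached to the link, then compare them with the topics of the page the user actually received. The talk describes this as a client-side check; Python is used here only to show the logic. The parameter name `bot_topics`, the comma-separated encoding, and the overlap threshold are hypothetical, and topic extraction from the page is assumed to happen elsewhere.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical name of the parameter appended to search and ad links.
TOPIC_PARAM = "bot_topics"


def topics_from_link(url):
    """Read the comma-separated topics the crawler attached to the link."""
    params = parse_qs(urlsplit(url).query)
    topics = set()
    for value in params.get(TOPIC_PARAM, []):
        topics.update(t.strip().lower() for t in value.split(",") if t.strip())
    return topics


def page_matches_link(url, page_topics, min_overlap=0.5):
    """Return True if enough of the crawler's topics also appear on the page."""
    expected = topics_from_link(url)
    if not expected:
        return True  # the link carries no topics, so there is nothing to check
    seen = {t.lower() for t in page_topics}
    return len(expected & seen) / len(expected) >= min_overlap
```

For example, a link tagged with gardening topics that lands on a luxury handbag storefront would fail the check.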
![Page 29: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/29.jpg)
Takeaways
Prevalence: 5% of ads and 12% of search results for cloaking-prone keywords cloak.
Techniques: IP/User-Agent/Referer only gets ⅕ of cloaking.
Moving forward: client-side, semantic features needed for hard cases.
![Page 30: Cloak of Visibility: Detecting When Machines Browse A Different Web - IEEE Security and Privacy 2016](https://reader031.vdocuments.net/reader031/viewer/2022030305/5872850e1a28abc7068b6f83/html5/thumbnails/30.jpg)
Thank you!
Luca Invernizzi