
Fighting Fire With Fire: Crowdsourcing Security Threats and Solutions on the Social Web

Gang Wang, Christo Wilson, Manish Mohanlal, Ben Y. Zhao
Computer Science Department, UC Santa Barbara


A Little Bit About Me

• 3rd-year PhD student @ UCSB
• Intern at MSR Redmond, 2011
• Intern at LinkedIn (Security Team), 2012
• Research interests:
  • Security and privacy
  • Online social networks
  • Crowdsourcing
  • Data-driven analysis and modeling

Recap: Threats on the Social Web

• Social spam is a serious problem
  • 10% of wall posts with URLs on Facebook are spam
  • 70% phishing
• Sybils underlie many attacks on online social networks
  • Spam, spear phishing, malware distribution
  • Sybils blend completely into the social graph
• Existing countermeasures are ineffective
  • Blacklists only catch 28% of spam
  • Sybil detectors from the literature do not work

Sybil Accounts on Facebook

• In-house estimates
  • Early 2012: 54 million
  • August 2012: 83 million
  • 8.7% of the user base
• Fake likes
  • VirtualBagel: useless site, 3,000 likes in 1 week
  • 75% from Cairo, age 13-17

• Sybil attacks at large scale
• Advertisers are fleeing Facebook

Sybil Accounts on Twitter

• 92% of Newt Gingrich’s followers are Sybils
  [Figure: follower count over time, annotated "4,000 new followers/day" and "100,000 new followers in 1 day".]
• Russian political protests on Twitter
  • 25,000 Sybils sent 440,000 tweets
  • 1 million Sybils controlled overall

• Twitter is vital infrastructure
• Sybils are usurping Twitter for political ends

Talk Outline

1. Malicious crowdsourcing sites – crowdturfing [WWW’12]
   • Spam and Sybils generated by real people
   • Huge threat in China
   • Growing threat in the US
2. Crowdsourced Sybil detection [NDSS’13]
   • If attackers can do it, why not defenders?
   • Can humans detect Sybils? Is this cost effective? (user study)
   • Design a crowdsourced Sybil detection system

Outline

• Intro
• Crowdturfing
  • Crowdsourcing overview
  • What is crowdturfing?
  • How bad is it?
  • Crowdturfing in the US
• Crowdsourced Sybil Detection
• Conclusion

High Quality Sybils and Spam

• We tend to think of spam as “low quality”
  [Example spam post shown under the name "Gang Wang": "MaxGentleman is the bestest male enhancement system avalable. http://cid-ce6ec5.space.live.com/"]
• What about high quality spam and Sybils?
  [Example profile labeled "FAKE", built from stock photographs.]
• Open questions
  • What is the scope of this problem?
  • Generated manually or mechanically?
  • What are the economics?

Black Market Crowdsourcing

• Amazon’s Mechanical Turk
  • Admins remove spammy jobs
• Black market crowdsourcing websites
  • Spam and fake accounts, generated by real people
  • Major force in China, expanding in the US and India

Crowdturfing = Crowdsourcing + Astroturfing

Crowdturfing Workflow

• Customers
  • Initiate campaigns
  • May be legitimate businesses
• Agents
  • Manage campaigns and workers
  • Verify completed tasks
• Workers
  • Complete tasks for money
  • Control Sybils on other websites

[Diagram: arrows labeled Campaign, Tasks, and Reports connect customers, agents, and workers.]

Crowdturfing in China

Site      Active Since  Total Campaigns  Workers  Reports  $ for Workers  $ for Site
Zhubajie  Nov. 2006     76K              169K     6.3M     $2.4M          $595K

[Figure: Site Growth Over Time. Log-scale plot of campaigns per month and dollars per month, Jan. 08 to Jan. 11, for Zhubajie and Sandaha.]

Spreading Spam on Weibo

[Figure: CDF of approximate audience size per campaign, log-scale x-axis.]

• 50% of campaigns reach >100,000 users
• 8% reach >1 million users

• Campaigns reach huge audiences
• How effective are these campaigns?

How Effective is Crowdturfing?

• Initiate our own campaigns as a customer
  • 4 benign ad campaigns promoting real e-commerce sites
  • All clicks route through our measurement server
• Travel agency reported sales statistics
  • 2 sales/month before our campaign
  • 11 sales within 24 hours after our campaign
  • Each trip sells for $1500!

Campaign: Vacation (advertise for a discount vacation through a travel agent)

Target  Cost  Tasks  Reports  Clicks  Cost Per Click
Weibo   $15   100    108      28      $0.21
QQ                   118      187     $0.09
Forums               123      3       $0.90

Web display ads: CPC = $0.01

Crowdturfing in America

US Sites                       % Crowdturfing
Legit         Mechanical Turk  12%
Black Market  MinuteWorkers    70%
              MyEasyTasks      83%
              Microworkers     89%
              ShortTasks       95%

• Other studies support these findings
  • Freelancer
    • 28% spam jobs
    • Bulk OSN accounts, likes, spam
    • Connections to botnet operators
  • Poultry Markets
    • $20 for 1000 followers
    • Ponzi scheme

Takeaways

• Identified a new threat: crowdturfing
  • Growing exponentially in size and revenue in China
  • $1 million per month on just one site
  • Cost effective: $0.21 per click
• Starting to grow in the US and other countries
  • Mechanical Turk, Freelancer
  • Twitter follower markets
• Huge problem for existing security systems
  • Little to no automation to detect
  • Turing tests fail

Outline

• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
  • Open questions
  • User study
  • Accuracy analysis
  • System design
• Conclusion

Crowdsourcing Sybil Defense

• Defenders are losing the battle against OSN Sybils
• Idea: build a crowdsourced Sybil detector
  • Leverage human intelligence
  • Scalable
• Open questions
  • How accurate are users?
  • What factors affect detection accuracy?
  • Is crowdsourced Sybil detection cost effective?

User Study

• Two groups of users
  • Experts – CS professors, masters, and PhD students
  • Turkers – crowdworkers from Mechanical Turk and Zhubajie
• Three ground-truth datasets of full user profiles
  • Renren – given to us by Renren Inc.
  • Facebook US and India – crawled
    • Legitimate profiles – 2 hops from our own profiles
    • Suspicious profiles – stock profile images
    • Banned suspicious profiles = Sybils

[Figure labels: Stock Picture, Crowdturfing Site.]

Classifying Profiles

[Screenshot of the survey interface, with callouts:]
• Progress
• Browsing profiles
• Screenshot of profile (links cannot be clicked)
• Real or fake? Why?
• Navigation buttons
• Testers may skip around and revisit profiles

Experiment Overview

Dataset         Sybil Profiles  Legit. Profiles  Test Group      # of Testers  Profiles per Tester
Renren          100             100              Chinese Expert  24            100
                                                 Chinese Turker  418           10
Facebook US     32              50               US Expert       40            50
                                                 US Turker       299           12
Facebook India  50              49               India Expert    20            100
                                                 India Turker    342           12

• Renren: data from Renren
• Facebook US and India: crawled data
• Fewer experts, but more profiles per expert

Individual Tester Accuracy

[Figure: CDF (%) of accuracy per tester (%) for Chinese Turkers, US Turkers, and US Experts. Callouts: "Not so good :(" and "Awesome! 80% of experts have >90% accuracy!"]

• Experts prove that humans can be accurate
• Turkers need extra help…

Accuracy of the Crowd

• Treat each classification by each tester as a vote
• Majority makes final decision

Dataset         Test Group      False Positives  False Negatives
Renren          Chinese Expert  0%               3%
                Chinese Turker  0%               63%
Facebook US     US Expert       0%               10%
                US Turker       2%               19%
Facebook India  India Expert    0%               16%
                India Turker    0%               50%

• Almost zero false positives
• Experts perform okay
• Turkers miss lots of Sybils

• False positive rates are excellent
• Turkers need extra help against false negatives
• What can be done to improve accuracy?
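The voting rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `majority_label`, the label strings, and the tie-breaking rule (ties count as legitimate, which favors a low false positive rate) are all assumptions.

```python
from collections import Counter

def majority_label(votes):
    """Return the label chosen by a strict majority of testers.

    votes: list of "sybil" / "legit" strings, one per tester (hypothetical format).
    Ties are resolved as "legit"; this tie-break is an assumption, not from the slides.
    """
    counts = Counter(votes)
    return "sybil" if counts["sybil"] > counts["legit"] else "legit"

# One profile classified by five testers: three say Sybil, two say legitimate.
print(majority_label(["sybil", "legit", "sybil", "sybil", "legit"]))  # prints "sybil"
```

Resolving ties toward "legit" is one possible design choice; a deployment that cares more about missed Sybils could resolve ties the other way.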


Eliminating Inaccurate Turkers

[Figure: false negative rate (%) vs. turker accuracy threshold (%) for China, India, and US.]

• Dramatic improvement: from 60% to 10% false negatives
• Most workers are >40% accurate
• Only a subset of workers are removed (<50%)
• Getting rid of inaccurate turkers is a no-brainer


How Many Classifications Do You Need?

[Figure: error rate (%) vs. classifications per profile (2 to 24) for China, India, and US, with false negatives and false positives plotted separately.]

• Only need 4-5 classifications to converge
• Fewer classifications = less cost
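One way to build intuition for why a handful of votes can be enough is a simple binomial model. This is an idealized sketch, not the analysis behind the slide: it assumes every voter is independently correct with the same probability, and the 80% accuracy used below is a hypothetical value.

```python
from math import comb

def majority_error(n_votes, accuracy):
    """Probability that a strict majority of n_votes independent voters is wrong.

    Assumes each voter is correct with the same probability `accuracy`
    and that n_votes is odd, so ties cannot occur.
    """
    if n_votes % 2 == 0:
        raise ValueError("use an odd number of votes to avoid ties")
    # The majority is wrong when at most n_votes // 2 voters are correct.
    return sum(
        comb(n_votes, k) * accuracy**k * (1 - accuracy) ** (n_votes - k)
        for k in range(n_votes // 2 + 1)
    )

# Error shrinks quickly as votes are added, with diminishing returns.
for n in (1, 3, 5, 7, 9):
    print(n, round(majority_error(n, 0.8), 4))
```

Real testers are neither independent nor equally accurate, so the measured curves need not follow this model.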


How to turn our results into a system?

1. Scalability
   • OSNs with millions of users
2. Performance
   • Improve turker accuracy
   • Reduce costs
3. Preserve user privacy when giving data to turkers

System Architecture

[Diagram components: Social Network Heuristics, User Reports, Suspicious Profiles, All Turkers, Turker Selection, Accurate Turkers, Very Accurate Turkers, OSN Employee, Sybils. Inaccurate turkers are marked "Rejected!".]

• Filtering layer
  • Leverage existing techniques
  • Help the system scale
• Crowdsourcing layer
  • Filter out inaccurate turkers
  • Maximize usefulness of high accuracy turkers
  • Continuous quality control
  • Locate malicious workers

Trace Driven Simulations

• Simulate 2000 profiles
• Error rates drawn from survey data
• Vary 4 parameters

[Diagram: two turker layers, Accurate Turkers and Very Accurate Turkers, each with a Classifications parameter, plus a Threshold and a Controversial Range. Values shown on the slide: 2, 5, 90%, 20-50%.]

Results
• Average 6 classifications per profile
• <1% false positives
• <1% false negatives

Results++
• Average 8 classifications per profile
• <0.1% false positives
• <0.1% false negatives
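The two-layer idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the system from the talk: it assumes that a profile is escalated to the very accurate turkers when the share of "Sybil" votes from the accurate turkers falls inside the controversial range, and that votes outside the range decide directly. All parameter defaults (votes per layer, voter accuracies, range) are hypothetical placeholders, and voters are simulated as independent coin flips rather than drawn from survey data.

```python
import random

def simulate_vote(is_sybil, accuracy, rng):
    """One simulated classification; True means the voter says "Sybil"."""
    correct = rng.random() < accuracy
    return is_sybil if correct else not is_sybil

def classify_profile(is_sybil, rng, lower_votes=5, upper_votes=3,
                     lower_accuracy=0.8, upper_accuracy=0.95,
                     controversial=(0.2, 0.5)):
    """Return (flagged_as_sybil, votes_used) for one profile."""
    votes = [simulate_vote(is_sybil, lower_accuracy, rng) for _ in range(lower_votes)]
    fraction = sum(votes) / lower_votes
    low, high = controversial
    if fraction < low:
        return False, lower_votes  # clearly legitimate
    if fraction > high:
        return True, lower_votes   # clearly Sybil
    # Controversial: escalate to the (assumed) very accurate layer.
    upper = [simulate_vote(is_sybil, upper_accuracy, rng) for _ in range(upper_votes)]
    return sum(upper) * 2 > upper_votes, lower_votes + upper_votes

def simulate(n_profiles=2000, sybil_share=0.5, seed=0):
    """Estimate error rates and average votes per profile."""
    rng = random.Random(seed)
    false_pos = false_neg = n_sybil = total_votes = 0
    for _ in range(n_profiles):
        is_sybil = rng.random() < sybil_share
        decision, used = classify_profile(is_sybil, rng)
        total_votes += used
        n_sybil += is_sybil
        if decision and not is_sybil:
            false_pos += 1
        elif is_sybil and not decision:
            false_neg += 1
    n_legit = n_profiles - n_sybil
    return {
        "false_positive_rate": false_pos / max(n_legit, 1),
        "false_negative_rate": false_neg / max(n_sybil, 1),
        "avg_votes": total_votes / n_profiles,
    }

print(simulate())
```

Because only controversial profiles reach the second layer, the average number of votes per profile stays between the lower-layer count and the sum of both layers.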


Estimating Cost

• Estimated cost in a real-world social network: Tuenti
  • 12,000 profiles to verify daily
  • 14 full-time employees
  • Annual salary 30,000 EUR (~$20 per hour)
  • $2240 per day
• Crowdsourced Sybil detection
  • 20 sec/profile, 8-hour day
  • 50 turkers
  • Facebook wage ($1 per hour)
  • $400 per day
• Cost with malicious turkers
  • Estimate that 25% of turkers are malicious
  • 63 turkers
  • $1 per hour
  • $504 per day
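The arithmetic behind these figures can be reproduced as follows. Two inputs are assumptions rather than statements from this slide: 6 classifications per profile (the average reported for the simulated system), and the reading that malicious turkers are offset by hiring 25% more workers. With those assumptions the numbers match the slide.

```python
import math

# Figures taken from the slide
PROFILES_PER_DAY = 12_000
SECONDS_PER_CLASSIFICATION = 20
HOURS_PER_DAY = 8
TURKER_WAGE_USD = 1.0
EMPLOYEES = 14
EMPLOYEE_WAGE_USD = 20.0
MALICIOUS_SHARE = 0.25

# Assumption: 6 classifications per profile; the cost slide does not state this.
CLASSIFICATIONS_PER_PROFILE = 6

def daily_cost(workers, wage_per_hour, hours=HOURS_PER_DAY):
    """Daily wage bill for `workers` people working `hours` each."""
    return workers * wage_per_hour * hours

work_hours = (PROFILES_PER_DAY * CLASSIFICATIONS_PER_PROFILE
              * SECONDS_PER_CLASSIFICATION / 3600)
turkers = math.ceil(work_hours / HOURS_PER_DAY)
# Assumption: compensate for malicious turkers by hiring 25% more workers.
turkers_with_malicious = math.ceil(turkers * (1 + MALICIOUS_SHARE))

print(daily_cost(EMPLOYEES, EMPLOYEE_WAGE_USD))                      # 2240.0
print(turkers, daily_cost(turkers, TURKER_WAGE_USD))                 # 50 400.0
print(turkers_with_malicious,
      daily_cost(turkers_with_malicious, TURKER_WAGE_USD))           # 63 504.0
```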


Takeaways

• Humans can differentiate between real and fake profiles
• Crowdsourced Sybil detection is feasible
• Designed a crowdsourced Sybil detection system
  • False positives and negatives <1%
  • Resistant to infiltration by malicious workers
  • Sensitive to user privacy
  • Low cost
• Augments existing security systems

Outline

• Intro
• Crowdturfing
• Crowdsourced Sybil Detection
• Conclusion
  • Summary of my work
  • Future work

Key Contributions

1. Identified novel threat: crowdturfing
   • End-to-end spam measurements from customers to the web
   • Insider knowledge of social spam
2. Novel defense: crowdsourced Sybil detection
   • User study proves feasibility of this approach
   • Build an accurate, scalable system
   • Possible deployment in real OSNs – LinkedIn and RenRen

Ongoing Work

1. Twitter follower markets
   • Locate customers who purchase Twitter followers in bulk
   • Study the un-follow dynamics of customers
   • Develop systems to detect customers in the wild
2. Sybil detection using server-side clickstreams
   • Build click models based on clickstream logs
   • Extract click patterns of Sybil and normal users
   • Develop systems to detect Sybils

Questions?

Thank you!

Potential Project Ideas

• Malware distribution in cellular networks
  • Identify malware-related cellular network traffic
  • Coordinated malware distribution campaigns
  • Feature-based detection
• Advertising traffic analysis on mobile apps
  • Characterize ad traffic
  • How effective are app-displayed ads at getting click-throughs?
  • Is malware delivered through ads?

Preserving User Privacy

• Showing profiles to crowdworkers raises privacy issues
• Solution: reveal profile information in context

[Diagram labels: Crowdsourced Evaluation (twice), Public Profile Information, Friend-Only Profile Information, Friends.]

Clickstream Sybil Detection

[Figures: two state-transition diagrams, "Sybil Clickstream" and "Normal Clickstream", with states including Initial, Final, Friend Invite, Share, Browse Profiles, Photo, and Message, and percentage weights on the transitions. The individual percentages cannot be matched to edges in this text.]

• Clickstream detection of Sybils
  1. Absolute number of clicks
  2. Time between clicks
  3. Page traversal order
• Challenges
  • Real-time
  • Massive scalability
  • Low overhead
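The three signals listed above could be extracted from a session log roughly as follows. This is an illustrative sketch only: the event format (timestamp in seconds, page name), the function name `clickstream_features`, and the use of first-order transition frequencies to capture traversal order are assumptions, not the design described in the talk.

```python
from collections import Counter, defaultdict

def clickstream_features(events):
    """Compute simple per-session features.

    events: list of (timestamp_seconds, page_name) tuples sorted by time
            (hypothetical log format).
    """
    times = [t for t, _ in events]
    pages = [p for _, p in events]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    # Count page-to-page transitions, then normalize per source page.
    counts = defaultdict(Counter)
    for src, dst in zip(pages, pages[1:]):
        counts[src][dst] += 1
    transitions = {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }
    return {
        "num_clicks": len(events),                            # 1. absolute number of clicks
        "mean_gap": sum(gaps) / len(gaps) if gaps else None,  # 2. time between clicks
        "transitions": transitions,                           # 3. page traversal order
    }

session = [(0, "friend_invite"), (2, "friend_invite"),
           (4, "browse_profiles"), (10, "friend_invite")]
print(clickstream_features(session))
```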


Are Workers Real People?

[Figure: % of reports from workers vs. hour of the day for Zhubajie (ZBJ) and Sandaha (SDH). Annotated periods: late night/early morning, work day/evening, lunch, dinner.]

Crowdsourced Sybil Detection

• How to detect crowdturfed Sybils?
  • Blur the line between real and fake
  • Difficult to detect algorithmically
• Anecdotal evidence that people can spot Sybils
  • 75% of friend requests from Sybils are rejected
  • Can people distinguish real/fake in general?
• User studies: experts, turkers, undergrads
  • What features give Sybils away?
  • Are certain Sybils tougher than others?
• Integration of human and machine intelligence

Survey Fatigue

[Figures: two panels (US Experts, US Turkers) plotting time per profile (s) and accuracy (%) against profile order. One panel, with profile order 0 to 9, is annotated "No fatigue"; the other, with profile order 0 to 48, is annotated "Fatigue matters".]

• All testers speed up over time

Sybil Profile Difficulty

[Figure: average accuracy per Sybil (%) vs. Sybil profiles ordered by turker accuracy (legend: Turker). Callouts: "Really difficult profiles" and "Experts perform well on most difficult Sybils".]

• Some Sybils are more stealthy
• Experts catch more tough Sybils than turkers
