TARGETED, NOT TRACKED: CLIENT-SIDE SOLUTIONS FOR PRIVACY-FRIENDLY BEHAVIORAL ADVERTISING Janice Tsai Misha Bilenko Matt Richardson

Post on 18-Dec-2015


TARGETED, NOT TRACKED: CLIENT-SIDE SOLUTIONS FOR PRIVACY-FRIENDLY BEHAVIORAL ADVERTISING

Janice Tsai

Misha Bilenko

Matt Richardson


Anonymous User Sees This Ad


Known User Sees A Different Ad


Personalized Advertising Today
• User is tracked: history of activity is stored
  • On ad platform’s server and/or cookie
• History is processed into profile
  • Reduced representation for quick lookup
  • Can also be communicated or sold across parties
• Profile is used for ad targeting
  • Total targeting revenue expected $2.6B by 2014 (eMarketer 2011)
  • Supported by all major ad platforms


Talk Outline

• Client-side vs. server-side profiles

• Client-only Profiles (CoP): balancing privacy and personalization

• Experiments: client- vs. server-side revenue difference


Personalized Advertising is Ubiquitous
• Driven by economics
  • Publishers, platforms: CPM rates 2.7x higher [Beales ’10]
  • Advertisers: 6x gain in CTR [Yao et al. ’08]
• What about users?
  • “It’s a little creepy, especially if you don’t know what’s going on” [NYT ’11]
  • What’s going on is complex and misunderstood [McDonald ’10-11]
  • Ad industry: self-regulation, users can opt out via
  • Browsers: Do Not Track (FF, IE, Safari), KeepMyOptOuts (Chrome)
  • Privacy advocates: self-regulation is insufficient
  • W3C Tracking Protection Working Group
  • Legislation: multiple bills/hearings in US; European e-Privacy directive


Personalized Advertising Mechanics

• User information drives market efficiency
• Users have no knowledge/control of their information
• First vs. third-party distinction is increasingly non-trivial

[Diagram: parties in the ad ecosystem: User, Publisher, several Ad Platforms, Advertiser, Aggregator.]


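Server-side User Profiles in Advertising

[Diagram: the client sends context c(t) (a query or URL) to the server. On the server, a log store holds the history Ht, a profile update step turns p(t) into p(t+1) in the profile store, and the personalized response computation returns a response (an ad).]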


Problem: No User Control over Data
• Users do not know what is stored, where and why
  • Use, retention, sharing
• Users cannot edit or delete their behavioral data
  • Deleting cookies is insufficient: re-identification, LSOs, local storage
  • Opting out ≠ having your data purged
• Most users find tracking invasive when asked [McDonald-Cranor ’10]
  • But they don’t do much about it: Do Not Track adoption in Firefox is 4-6%


Current “Do Not Track” Proposals
• Provide a mechanism for users to prevent being tracked
• Existing browser implementations
  • HTTP headers, opt-out cookies
    • Browser contacts server but notifies it that user does not want to be tracked
    • User must trust service providers
  • Domain blocking / Tracking Protection Lists (TPLs)
    • Browser doesn’t send requests to certain domains
• Tracking vs. targeting: collection vs. usage
  • “All or nothing” approach: privacy = no targeting
  • Undesirable extremes: inefficiency vs. loss of revenue
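The two browser-side mechanisms on this slide can be sketched as follows. This is a minimal illustration, not any browser's actual implementation: the blocklist contents and the function names `is_blocked` and `prepare_request` are hypothetical, and the header is written as `DNT: 1`, the name later adopted for the Do Not Track header.

```python
from typing import Dict, Optional, Set
from urllib.parse import urlsplit

# Hypothetical blocklist; real Tracking Protection Lists are far larger.
BLOCKED_DOMAINS: Set[str] = {"tracker.example", "ads.example"}


def is_blocked(url: str, blocked: Set[str] = BLOCKED_DOMAINS) -> bool:
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)


def prepare_request(url: str, headers: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the headers to send, or None if the request must not be sent."""
    if is_blocked(url):
        # Domain blocking: no request leaves the browser, so no ad is shown.
        return None
    out = dict(headers)
    # Header approach: the request is still sent; the server is trusted to comply.
    out["DNT"] = "1"
    return out
```

The sketch makes the “all or nothing” point concrete: blocking returns nothing at all, while the header leaves compliance entirely to the server.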


Client-Side Tracking
• Tracking is performed solely on client machine
• User retains control, targeting is still possible
  • User can delete or edit profile
  • Services don’t retain user history
• No back-end sharing of user data between companies
  • Avoid issues around retention policies, deleting all copies, etc.
• Studies indicate users care more about being tracked than about being targeted


Existing Plugin-Based Approaches
• Privad, Adnostic, RePRIV
• User installs client plugin which collects user data and communicates with ad network
• Difficulties
  • Requires user to install plugin
  • Requires significant changes to existing ad serving infrastructure
  • Hard to manage click fraud, ad budgets
  • Bandwidth (e.g., 10x ads sent to client)
  • Targeting algorithms baked into plugin may slow innovation
  • Targeting on client = less information than targeting on server


Alternative: Client-Only Profiles (CoP)
• Profile stored in cookie on client machine

• Browser sends profile to server upon page request

• Server returns page and updated profile in cookie

• Server does not log user activity
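The round trip described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the cookie name `cop_profile`, the size cap, the JSON encoding, and the naive most-recent-first update rule are all hypothetical stand-ins.

```python
import json
from http.cookies import SimpleCookie
from urllib.parse import quote, unquote

COOKIE_NAME = "cop_profile"  # hypothetical cookie name
MAX_KEYWORDS = 20            # hypothetical cap on profile size


def decode_profile(cookie_header):
    """Read profile p(t) from a Cookie header; empty list if none was sent."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME not in jar:
        return []
    return json.loads(unquote(jar[COOKIE_NAME].value))


def encode_profile(profile):
    """Serialize profile p(t+1) as a `name=value` cookie string."""
    jar = SimpleCookie()
    jar[COOKIE_NAME] = quote(json.dumps(profile), safe="")
    return jar[COOKIE_NAME].OutputString()


def update_profile(profile, context_keyword):
    """Naive stand-in update: most recent keyword first, capped in size."""
    updated = [context_keyword] + [k for k in profile if k != context_keyword]
    return updated[:MAX_KEYWORDS]


def handle_request(cookie_header, context_keyword, ads):
    """Serve one request: pick an ad, return it with the updated cookie.

    Nothing is written to a log or profile store; the only copy of the
    profile leaves the server again inside the returned cookie.
    """
    profile = decode_profile(cookie_header)
    candidates = [context_keyword] + profile
    ad = next((ads[k] for k in candidates if k in ads), "generic ad")
    return ad, encode_profile(update_profile(profile, context_keyword))
```

A second request that presents the cookie from the first one is targeted using the earlier keyword, even though the server kept no state in between.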


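Client-only Profiles

[Diagram: the client keeps the profile in local storage and sends context c(t) and profile p(t) to the server. The server runs the personalized response computation and the profile update, then returns the response and the updated profile p(t+1) to the client.]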



Client-only Profiles
+ No plugins (Adnostic, RePRIV, Privad: users install plugins)
+ No major changes to serving infrastructure
+ Targeting server-side (advanced features/algorithms)
+ Profile update server-side (advanced features/algorithms)
- Must trust ad platform to comply with policy and not retain user data
  • Debatable proposition for security community…
  • …but Do Not Track already makes the same assumption

What will it cost compared to server-side tracking?


Comparison of Tracking Approaches


Incremental Profile Updates: Task
• How much does incremental update hurt?
  • Compare to profiles constructed on server from full history
  • May depend on the task (personalizing ads, content, search results)
• Representative task: predicting future ad clicks
  • Discriminates long-term user interests
  • Can be used for ad selection, ranking, CTR prediction, auction
• Bid Increments
  • Advertiser specifies an increment to their bid if the user has the keyword in their profile
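A bid increment can be illustrated with a small worked example. The `Bid` record, the function names, and the dollar amounts below are hypothetical; the only rule taken from the slide is that the increment applies when the keyword is in the user's profile.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Bid:
    advertiser: str
    keyword: str
    base_bid: float
    increment: float  # extra amount offered when the keyword is in the profile


def effective_bid(bid: Bid, profile: Set[str]) -> float:
    """Base bid, plus the increment if the user's profile has the keyword."""
    if bid.keyword in profile:
        return bid.base_bid + bid.increment
    return bid.base_bid


def rank_bids(bids: List[Bid], profile: Set[str]) -> List[Bid]:
    """Order bids from highest to lowest effective bid for this user."""
    return sorted(bids, key=lambda b: effective_bid(b, profile), reverse=True)


bids = [
    Bid("A", "cameras", base_bid=0.50, increment=0.25),
    Bid("B", "shoes", base_bid=0.60, increment=0.50),
]
```

With an empty profile, advertiser B ranks first at 0.60. With `{"cameras"}` in the profile, advertiser A's effective bid rises to 0.75 and A ranks first.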


Incremental Profile Updates: Method [Bilenko and Richardson KDD-2011]
• Algorithm based on machine learning
• Features based on behavior frequency/recency, context, etc.
• ML function predicts p(click|keyword) using these features
• Select top-k keywords for profile
  • Keyword value is incremental utility of ads not covered in profile so far
  • Leads to a submodular optimization problem
  • Solved by efficient, accurate approximate algorithm
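The selection step can be illustrated with the standard greedy heuristic for coverage-style submodular objectives. This is an assumption for illustration, not necessarily the algorithm of the cited paper: the names `keyword_ads` and `ad_utility`, and the toy numbers in the usage note, are hypothetical.

```python
from typing import Dict, List, Set


def greedy_profile(keyword_ads: Dict[str, Set[str]],
                   ad_utility: Dict[str, float],
                   k: int) -> List[str]:
    """Pick up to k keywords, each time taking the largest marginal gain.

    keyword_ads maps a keyword to the ads it would match; ad_utility maps
    an ad to its predicted click utility. A keyword's gain counts only the
    ads not already covered by keywords chosen earlier.
    """
    covered: Set[str] = set()
    profile: List[str] = []
    for _ in range(k):
        best_kw, best_gain = None, 0.0
        for kw, ads in keyword_ads.items():
            if kw in profile:
                continue
            gain = sum(ad_utility[a] for a in ads - covered)
            if gain > best_gain:
                best_kw, best_gain = kw, gain
        if best_kw is None:  # no remaining keyword adds any utility
            break
        profile.append(best_kw)
        covered |= keyword_ads[best_kw]
    return profile
```

For example, with "camera" matching ads a1 and a2, "dslr" matching a2 and a3, and "shoes" matching a4, "dslr" looks valuable on its own but adds little once "camera" is chosen, so a size-2 profile takes "shoes" instead.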


Incremental Updates: Study
• Two months of activity on Bing search engine
• 2.4 million users (randomly sampled from total population)
• Train predictor using first 6 weeks
• Cookie contains
  • Profile: Top-k keywords by predicted value
  • Cache: LRU policy
• Metric
  • Fraction of future clicks in profile (proportional to revenue gain)
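The cookie contents and the metric can be sketched as below. This is a simplified illustration, not the study's code: the class name `CookieState`, the `score` callable (a stand-in for the trained predictor), and the rule of re-ranking profile and cache together on every event are assumptions.

```python
from collections import OrderedDict


class CookieState:
    """Profile (top-k keywords by score) plus an LRU cache of recent keywords."""

    def __init__(self, profile_size, cache_size, score):
        self.profile_size = profile_size
        self.cache_size = cache_size
        self.score = score          # stand-in for the learned p(click|keyword)
        self.profile = []
        self.cache = OrderedDict()  # keyword -> times seen while cached

    def observe(self, keyword):
        """Record one event, evict by LRU, then re-select the top-k profile."""
        self.cache[keyword] = self.cache.get(keyword, 0) + 1
        self.cache.move_to_end(keyword)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # drop the least recently used
        candidates = set(self.profile) | set(self.cache)
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(candidates, key=lambda kw: (-self.score(kw), kw))
        self.profile = ranked[: self.profile_size]


def fraction_of_clicks_in_profile(profile, future_click_keywords):
    """Share of future clicks whose keyword is already in the profile."""
    if not future_click_keywords:
        return 0.0
    hits = sum(1 for kw in future_click_keywords if kw in profile)
    return hits / len(future_click_keywords)
```

A high-scoring keyword can stay in the profile after it has been evicted from the cache, which is what lets a small cookie approximate a profile built from the full history.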


Incremental Updates: Results

• Retains 97-99% of gain vs. server-side tracking
• Requires only 20-50 keywords in profile (50 in cache)

[Figure: % of server-side utility (y-axis, 50% to 100%) versus cache size (x-axis, 10 to 50), with one series per profile size: 10, 20, 30, 40, 50.]


Conclusions
• Client-side tracking balances privacy and market efficiency
• Possible approach: CoP, which
  • Ensures user control over tracking
  • Requires only minor changes to existing infrastructure
  • Retains 97+% of revenue gains from ad targeting
• Should Do Not Track distinguish client- and server-side tracking?
  • 1st vs. 3rd party are increasingly difficult to differentiate


THANKS!


Backup Slide
• If I have to trust the server anyway, why not trust it to store my profile as well?
  • Trusting it not to store is a lower bar than trusting it to properly handle the profile
• Storing profile on server = Trusting any team with access to your profile to:
  • Know the policies
  • Correctly implement things like opt-out, retention, publication
  • Either never copy your history, or ensure your edits/deletions are propagated through to all copies
  • Not share it with any other team that might not know these things
• Storing profile on client = Trusting just the team that receives the profile to use it and throw it away


1st vs. 3rd party
• Distinction is getting increasingly muddled
  • 1st party data collection is becoming pervasive
  • 3rd party collection can be tightly controlled by advertiser


Regulatory Interest in Behavioral Advertising

• United States
  • Federal Trade Commission has proposed a regulatory framework calling for Do Not Track solutions
  • Legislation calls for Do Not Track solutions
    • US Senate, US House of Representatives, California Legislature
• Europe
  • Notice and Consent prior to depositing cookies


Do Not Track Solutions

HTTP Header
- Requests carry an “X-Do-Not-Track” header
Blocking Traffic
- URI Blacklist
- Blocks traffic = Blocks Ads
Opt-Out Cookie
- Cookie signals Opt-Out intent
- Blocks traffic = Blocks Ads
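On the receiving side, a server that honors these signals has to look for them in each request. The sketch below is an assumption-laden illustration: the cookie name `optout` and the function name `user_opted_out` are hypothetical, and the check accepts both the “X-Do-Not-Track” name used on this slide and the later `DNT: 1` form.

```python
from http.cookies import SimpleCookie
from typing import Dict

OPT_OUT_COOKIE = "optout"  # hypothetical opt-out cookie name


def user_opted_out(headers: Dict[str, str]) -> bool:
    """Return True if the request carries any of the opt-out signals."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get("dnt") == "1":
        return True
    if "x-do-not-track" in normalized:
        return True
    jar = SimpleCookie()
    jar.load(normalized.get("cookie", ""))
    return OPT_OUT_COOKIE in jar and jar[OPT_OUT_COOKIE].value == "1"
```

Detecting the signal is the easy part; as the earlier slides note, the user still has to trust the server to act on it.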


Do Not Track Solutions

[Table: DNT solutions (rows: Blocking Traffic, Opt-Out Cookie, HTTP Header) by browser (columns: Apple Safari, Google Chrome, Microsoft IE, Mozilla Firefox).]

Do Not Track solutions are built into each browser with the exception of Google Chrome where the Opt-Out cookies are a part of a browser extension.