
When Recommendation Systems Go Bad

Evan Estola, 3/31/17

About Me

● Evan Estola

● Staff Machine Learning Engineer, Data Team Lead @ Meetup

● evan@meetup.com

● @estola

Meetup

● Do more

● 270,000 Meetup Groups

● 30 Million Members

● 180 Countries

Why Recs at Meetup are Hard

● Cold Start

● Sparsity

● Lies

Recommendation Systems: Collaborative Filtering

Recommendation Systems: Rating Prediction

● Netflix prize

● How many stars would user X give movie Y?

● Ineffective!

Recommendation Systems: Learning To Rank

● Treat Recommendations as a supervised ranking problem

● Easy mode:

○ Positive samples - joined a Meetup

○ Negative samples - didn’t join a Meetup

○ Logistic Regression, use output/confidence for ranking (see sketch below)
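A minimal sketch of this "easy mode" setup, using scikit-learn and invented toy features (the columns and data are illustrative assumptions, not Meetup's actual pipeline):

```python
# Sketch: recommendations as a supervised ranking problem.
# Each row is a (member, Meetup group) pair; label = 1 if the member joined.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in features: topic overlap, distance to group, friends already in it.
X = rng.random((1000, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.3, 1000) > 1.0).astype(int)

model = LogisticRegression().fit(X, y)

# At serving time, score every candidate group for one member and rank
# by the model's confidence that the member would join.
candidates = rng.random((20, 3))
scores = model.predict_proba(candidates)[:, 1]
ranking = np.argsort(-scores)   # highest-scoring candidates first
print(ranking[:5], scores[ranking[:5]])
```

Ranking by the probability output rather than the hard 0/1 prediction is what turns a plain classifier into a ranker.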

You just wanted a kitchen scale, now Amazon thinks you’re a drug dealer

● “Black-sounding” names 25% more likely to be served an ad suggesting a criminal record

● Fake profiles, track ads

● Career coaching for “200k+” Executive jobs ad

○ Male group: 1852 impressions

○ Female group: 318 impressions

● Twitter bot

● “Garbage in, garbage out”

● Responsibility?

“In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”

Tay.ai

Know your data

● Outliers can matter

● The real world is messy

● Some people will mess with you

● Not everyone looks like you

○ Airbags

● More important than ever with more impactful applications

○ Example: Medical data

Keep it simple

● Interpretable models

● Feature interactions

○ Using features against someone in unintended ways

○ Work experience is good up until a point?

○ Consequences of location?

○ Combining gender and interests?

● When you must get fancy, combine grokable models (sketch below)
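One payoff of keeping models grokable is that you can read the learned weights directly, including an explicit interaction term such as gender × interest, and spot a feature being used in an unintended way. A hedged sketch with invented feature names and toy data:

```python
# Sketch: read an interpretable model's weights, including an explicit
# interaction term, to see how each feature is actually being used.
# All names and data are illustrative, not a real pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
gender = rng.integers(0, 2, n)               # sensitive attribute (0/1)
interest_tech = rng.integers(0, 2, n)        # some interest signal
years_experience = rng.random(n) * 20

X = np.column_stack([gender, interest_tech, years_experience,
                     gender * interest_tech])   # explicit, auditable interaction
names = ["gender", "interest_tech", "years_experience", "gender_x_interest"]

# Toy target; in a real audit this is the model's actual training label.
y = (interest_tech + 0.05 * years_experience + rng.normal(0, 0.5, n) > 1).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
for name, coef in zip(names, model.coef_[0]):
    print(f"{name:>20}: {coef:+.3f}")
```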

Ensemble Model, Data Segregation

[Diagram] Model 1 data: interests, searches, friends, location -> Model 1 prediction. Model 2 data: gender, friends, location -> Model 2 prediction. Final model data: Model 1 prediction + Model 2 prediction -> Final prediction.
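A rough sketch of the data-segregation idea in the diagram, assuming scikit-learn and toy stand-in features: two sub-models trained on separate feature sets, with a small final model over their predictions. A production stacked ensemble would fit the final model on out-of-fold predictions; this only shows the shape of the idea.

```python
# Sketch: "data segregation" ensemble. Model 1 never sees the sensitive
# feature; Model 2 isolates it; the final model only sees their predictions,
# so the sensitive path stays small and auditable. Data is toy stand-in.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 3000
interests, searches, friends, location, gender = (rng.random(n) for _ in range(5))
y = (interests + 0.5 * friends + rng.normal(0, 0.3, n) > 1).astype(int)

X1 = np.column_stack([interests, searches, friends, location])   # Model 1 data
X2 = np.column_stack([gender, friends, location])                 # Model 2 data

m1 = LogisticRegression().fit(X1, y)
m2 = LogisticRegression().fit(X2, y)

# Final model over the two sub-model predictions only.
stacked = np.column_stack([m1.predict_proba(X1)[:, 1],
                           m2.predict_proba(X2)[:, 1]])
final = LogisticRegression().fit(stacked, y)
print(final.coef_)   # relative weight given to each sub-model
```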

Diversity Controlled Testing

● CMU - AdFisher

○ Crawls ads with simulated user profiles

● Same technique can work to find bias in your own models!

○ Generate Test Data

■ Randomize sensitive feature in real data set

○ Run Model

■ Evaluate for unacceptable biased treatment

● Florian Tramèr

○ FairTest

https://research.google.com/bigpicture/attacking-discrimination-in-ml/
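In the same spirit as AdFisher and FairTest, though far simpler, here is a hedged sketch of the "randomize the sensitive feature, re-run the model, compare" idea; the function name and setup are assumptions for illustration only:

```python
# Sketch: permutation-style bias probe. Randomize one sensitive column in a
# copy of real data, re-score, and see how far the model's outputs move.
# A large shift suggests the model relies on that feature (or a tight proxy).
import numpy as np

def sensitive_feature_effect(model, X, sensitive_col, n_trials=20, seed=0):
    """Mean absolute score change when `sensitive_col` is randomly permuted."""
    rng = np.random.default_rng(seed)
    base = model.predict_proba(X)[:, 1]
    deltas = []
    for _ in range(n_trials):
        X_rand = X.copy()
        X_rand[:, sensitive_col] = rng.permutation(X_rand[:, sensitive_col])
        deltas.append(np.abs(model.predict_proba(X_rand)[:, 1] - base).mean())
    return float(np.mean(deltas))

# Usage with the segregated models sketched earlier (column 0 of X2 is "gender"):
# print(sensitive_feature_effect(m2, X2, sensitive_col=0))
```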

Human Problems

● Auto-ethics

○ Defining un-ethical features

○ Who decides to look for fairness in the first place?

● By restricting or removing certain features, aren’t you sacrificing performance?

● Isn’t it actually adding bias if you decide which features to put in or not?

● If the data shows that there is a relationship between X and Y, isn’t that your ground truth?

● Isn’t that sub-optimal?

It’s always a human problem

● “All Models are wrong, but some are useful”

● Your model is already biased

Bad Features

● Not all features are ok!

○ ‘Time travelling’

■ Rating a movie => watched the movie

■ Cancer surgery => had cancer
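A small sketch of why "time travelling" features are dangerous: a column that is effectively only known after the outcome (having rated the movie, having had the surgery) inflates offline metrics while being unavailable at real prediction time. Toy data, illustrative only:

```python
# Sketch: a leaked ("time travelling") feature makes offline evaluation look
# great, because it partly encodes the label itself.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 5000
honest = rng.random(n)
y = (honest + rng.normal(0, 0.5, n) > 0.8).astype(int)
leaky = y * (rng.random(n) > 0.1)   # "rated the movie" style: near-copy of the label

for X, name in [(honest.reshape(-1, 1), "honest feature only"),
                (np.column_stack([honest, leaky]), "with leaked feature")]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    clf = LogisticRegression().fit(X_tr, y_tr)
    print(name, "AUC =", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```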

Misguided Models

● “It’s difficult to make predictions, especially about the future”

○ Offline performance != Online performance

○ Predicting past behavior != Influencing behavior

○ Example: Clicks vs. buy behavior in ads

Asking the right questions

● Need a human

○ Choosing features

○ Choosing the right target variable

■ Value-added ML

“Computers are useless, they can only give you answers”

Bad Questions

● Questionable real-world applications

○ Screen job applications

○ Screen college applications

○ Predict salary

○ Predict recidivism

● Features?

○ Race

○ Gender

○ Age

Correlating features

● Name -> Gender

● Name -> Age

● Grad Year -> Age

● Zip -> Socioeconomic Class

● Zip -> Race

● Likes -> Age, Gender, Race, Sexual Orientation...

● Credit score, SAT score, college prestige...
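Because of correlations like these, simply dropping a sensitive column is not enough; a quick check is whether the remaining features can reconstruct it. A hedged sketch with invented features and toy data:

```python
# Sketch: proxy check. With the sensitive column removed, can the remaining
# features still predict it? High accuracy means it was never really removed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n = 4000
zip_income = rng.random(n)                         # socioeconomic proxy (e.g. from zip)
grad_year = (rng.integers(1980, 2020, n) - 2000) / 20.0
sensitive = (zip_income + rng.normal(0, 0.2, n) > 0.5).astype(int)

X_remaining = np.column_stack([zip_income, grad_year])
proxy_acc = cross_val_score(LogisticRegression(), X_remaining, sensitive, cv=5).mean()
print(f"Sensitive attribute recoverable with accuracy ~{proxy_acc:.2f}")
```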

At your job...

Not everyone will have the same ethical values, but you don’t have to take ‘optimality’ as an argument against doing the right thing.

You know racist computers are a bad idea

Don’t let your company invent racist computers

@estola
