
When Recommendation Systems Go Bad

Evan Estola, RecSys 2016

9/17/16

About Me

Evan Estola
Lead Machine Learning Engineer @ Meetup
[email protected]
@estola

We want a world full of real, local community.
Women’s Veterans Meetup, San Antonio, TX

Why Recs at Meetup are Hard

Cold Start
Sparsity
Lies

Schenectady (ZIP code 12345: users lying about their location often enter 12345, which geocodes to Schenectady, NY)

Data Science impacts lives

Ads you see
Apps you download
Friend’s activity / Facebook feed
News you’re exposed to
If a product is available
If you can get a ride
Price you pay for things
Admittance into college
Job openings you find/get
If you can get a loan

Recommendation Systems: Collaborative Filtering

Completely Normal Book Recommendations
For Asimov’s Foundation:

Foundation and Empire
Second Foundation
Prelude to Foundation
Forward the Foundation
Foundation’s Edge
Foundation and Earth

https://en.wikipedia.org/wiki/File:Isaac_Asimov_on_Throne.png
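The mechanics behind recommendations like these: item-based collaborative filtering suggests books whose fans overlap with the seed book’s fans. A minimal sketch in Python (toy data and Jaccard similarity chosen for illustration; not any specific production implementation):

```python
# user -> set of liked books (toy data standing in for real ratings)
likes = {
    "alice": {"Foundation", "Foundation and Empire", "Second Foundation"},
    "bob":   {"Foundation", "Foundation and Empire", "Prelude to Foundation"},
    "carol": {"Foundation", "Second Foundation", "Foundation's Edge"},
}

def jaccard(a, b):
    """Overlap between two fan sets."""
    return len(a & b) / len(a | b)

def recommend(seed_book, k=3):
    # Map each book to the set of users who like it.
    fans = {book: {user for user, books in likes.items() if book in books}
            for books in likes.values() for book in books}
    seed_fans = fans[seed_book]
    # Score every other book by fan overlap with the seed book.
    scores = {b: jaccard(seed_fans, f) for b, f in fans.items() if b != seed_book}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("Foundation"))  # more Foundation books, as on the slide
```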

Completely Normal Search Engine Results

Query: Obama birth place
1. Honolulu, HI
2. Wikipedia: Obama birth place conspiracy theories
3. Birth Certificate: WhiteHouse.gov

Query: Obama birth certificate fake
1. 10 Facts that show Obama Birth Certificate is FAKE
2. OBAMA’S LAWYERS ADMIT TO FAKING BIRTH CERTIFICATE
3. Video: Proof Obama Birth Certificate is Fake

You just wanted a kitchen scale; now the internet thinks you’re a drug dealer.
You purchased: Mini digital pocket kitchen scale!
You probably want:
100 pack subtle resealable baggies
250 perfectly legal ‘cigarette’ paper booklets
Totally reasonable number of small plastic bags
1000 ‘cigar’ wraps

Completely normal product results

https://commons.wikimedia.org/wiki/File:Cigarette_rolling_papers_%287%29.JPG

Orbitz (showed Mac users higher-priced hotel results)

https://en.wikipedia.org/wiki/File:CitigroupCenterChicago.jpg

Ego

Member/customer/user first
Focus on building the best product, not on being the most clever data scientist

Much harder to spin a positive user story than a story about how smart you are

“Google searches involving black-sounding names are more likely to serve up ads suggestive of a criminal record”

“Black-sounding” names 25% more likely to be served ad suggesting criminal record

“NAME arrested?” Ads suggest queried name is associated with an arrest and warrants a background check

Ads for services related to recovering from arrest/incarceration

https://www.technologyreview.com/s/510646/racism-is-poisoning-online-ad-delivery-says-harvard-professor/

Ethics

We have accepted that Machine Learning can seem creepy; how do we prevent it from becoming immoral?

We have an ethical obligation to not teach machines to be prejudiced.

Data Ethics

Awareness

Talk about it!

Identify groups that could be negatively impacted by your work
Make a choice
Take a stand

Interpretable Models

For simple problems, simple solutions are often worth a small concession in performance

Inspectable models make it easier to debug problems in data collection, feature engineering, etc.

Only include features that work the way you want

Don’t include feature interactions that you don’t want

Logistic Regression

StraightDistanceFeature(-0.0311f)
ChapterZipScore(0.0250f)
RsvpCountFeature(0.0207f)
AgeUnmatchFeature(-1.5876f)
GenderUnmatchFeature(-3.0459f)
StateMatchFeature(0.4931f)
CountryMatchFeature(0.5735f)
FacebookFriendsFeature(1.9617f)
SecondDegreeFacebookFriendsFeature(0.1594f)
ApproxAgeUnmatchFeature(-0.2986f)
SensitiveUnmatchFeature(-0.1937f)
KeywordTopicScoreFeatureNoSuppressed(4.2432f)
TopicScoreBucketFeatureNoSuppressed(1.4469f, 0.257f, 10f)
TopicScoreBucketFeatureSuppressed(0.2595f, 0.099f, 10f)
ExtendedTopicsBucketFeatureNoSuppressed(1.6203f, 1.091f, 10f)
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.1702f, 0.252f, 0.641f)
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.4983f, 0.641f, 10f)
DoneChapterTopicsFeatureNoSuppressed(3.3367f)
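To make the interpretability concrete, here is a minimal sketch of how such a model scores a member/group pair in Python (a few weights taken from the slide; feature names shortened and input values invented for illustration):

```python
import math

# Weights copied from the slide; each feature's influence is directly readable.
# (Real feature extraction is omitted; the inputs below are illustrative values.)
WEIGHTS = {
    "straight_distance": -0.0311,
    "gender_unmatch": -3.0459,       # strongly suppresses groups whose gender requirement the member fails
    "facebook_friends": 1.9617,
    "keyword_topic_score": 4.2432,
}

def score(features):
    """Logistic regression: sigmoid of a weighted sum, so every term is inspectable."""
    z = sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

print(score({
    "straight_distance": 12.0,   # miles between member and group
    "gender_unmatch": 0.0,       # member satisfies any gender requirement
    "facebook_friends": 1.0,     # member has Facebook friends in the group
    "keyword_topic_score": 0.8,  # topical interest match
}))
```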

Feature Engineering and Interactions

● Good Feature: Join! (You’re interested in Tech × Meetup is about Tech)
● Good Feature: Don’t join! (Group is intended only for Women × You are a Man)
● Bad Feature: Don’t join! (Group is mostly Men × You are a Woman)
● Horrible Feature: Don’t join! (Meetup is about Tech × You are a Woman)

Meetup is not interested in propagating gender stereotypes
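One way to enforce this is to whitelist feature interactions explicitly instead of letting the model cross everything. A minimal sketch, with invented feature names:

```python
# Whitelist of interactions the model is allowed to learn. Anything not
# listed (e.g. topic x gender) is never generated as a feature, so the
# model cannot pick it up even if it would improve offline metrics.
ALLOWED_CROSSES = {
    ("member_interested_in_tech", "group_is_about_tech"),
    ("group_is_women_only", "member_is_man"),
}

def cross_features(features):
    crossed = {}
    for a, b in ALLOWED_CROSSES:
        crossed[f"{a}_x_{b}"] = features.get(a, 0.0) * features.get(b, 0.0)
    return crossed

print(cross_features({
    "member_interested_in_tech": 1.0,
    "group_is_about_tech": 1.0,
    "member_is_woman": 1.0,  # present in the raw data, but never crossed with topic features
}))
```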

Ensemble Models and Data Segregation

Ensemble Models: Combine outputs of several classifiers for increased accuracy

If you have features that are useful but whose interactions worry you (and your model builds interactions automatically), use ensemble modeling to restrict those features to separate models.

Ensemble Model, Data Segregation

Model 1 data: Interests, Searches, Friends, Location → Model 1 prediction
Model 2 data: Gender, Friends, Location → Model 2 prediction
Final model data: Model 1 prediction, Model 2 prediction → Final prediction

https://commons.wikimedia.org/wiki/File:Animation2.gif
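A minimal sketch of the segregation idea in Python (model classes, weights, and feature names are illustrative assumptions, not Meetup’s production code):

```python
from dataclasses import dataclass

@dataclass
class LinearModel:
    weights: dict

    def predict(self, features):
        return sum(self.weights.get(k, 0.0) * v for k, v in features.items())

# Each sub-model sees only a segregated slice of the data.
MODEL1_FEATURES = {"interest_match", "search_match", "friends", "location_score"}  # no gender
MODEL2_FEATURES = {"gender_unmatch", "friends", "location_score"}                  # no interests/searches

model1 = LinearModel({"interest_match": 2.0, "search_match": 1.0, "friends": 0.5, "location_score": 0.3})
model2 = LinearModel({"gender_unmatch": -3.0, "friends": 0.5, "location_score": 0.3})
blender = LinearModel({"model1": 0.7, "model2": 0.3})

def restrict(features, allowed):
    return {k: v for k, v in features.items() if k in allowed}

def final_prediction(features):
    p1 = model1.predict(restrict(features, MODEL1_FEATURES))
    p2 = model2.predict(restrict(features, MODEL2_FEATURES))
    # The final model sees only the two sub-model scores, so it cannot
    # build an interest x gender interaction.
    return blender.predict({"model1": p1, "model2": p2})

print(final_prediction({
    "interest_match": 1.0, "search_match": 0.5,
    "friends": 1.0, "location_score": 0.2, "gender_unmatch": 1.0,
}))
```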

“Women less likely to be shown ads for high-paid jobs on Google, study shows”

Carnegie Mellon ‘AdFisher’ project
Fake profiles, track ads
Career coaching for “200k+” executive jobs ad
Male group: 1852 impressions
Female group: 318

https://www.theguardian.com/technology/2015/jul/08/women-less-likely-ads-high-paid-jobs-google-study

Diversity Controlled Testing

Same technique can work to find bias in your own models!
Generate test data: randomize the sensitive feature in a real data set
Run model
Evaluate for unacceptable biased treatment
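A minimal sketch of such a test, assuming a model with a `predict(features)` method and an illustrative `gender` field; this variant enumerates both values rather than randomizing, but the idea is the same:

```python
def diversity_controlled_test(model, records, key="gender", values=("F", "M")):
    """Flip a sensitive feature on real records, hold everything else constant,
    and measure how far the model's predictions move."""
    gaps = []
    for record in records:
        scores = []
        for value in values:
            variant = dict(record)
            variant[key] = value  # only the sensitive feature changes
            scores.append(model.predict(variant))
        gaps.append(max(scores) - min(scores))
    # Average treatment gap across records; a large value warrants human review.
    return sum(gaps) / len(gaps)
```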

What about automating this?

FairTest algorithm (Florian Tramèr)
Still needs you to decide which features are bad
Humanity required

“‘Holy F**K’: When Facial Recognition Algorithms Go Wrong”

Google Photos service
Automatic image tagging
Tagged African American couple as “gorillas”

http://www.fastcompany.com/3048093/fast-feed/holy-fk-when-facial-recognition-algorithms-go-wrong

● Twitter bot
● “Garbage in, garbage out”
● Responsibility?

“In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”

Tay.ai
“Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day”

http://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist

Diverse test data

Outliers can matter

The real world is messy

Some people will mess with you

Some people look/act different than you

Defense

Diversity
Design

“There’s software used across the country to predict future criminals. And it’s biased against blacks.”

Algorithm for predicting repeat offenders is used to decide how harsh the sentence for a crime should be

Proprietary model: undisclosed algorithm, features, etc.

Claims to not use race as a factor
Nearly twice as likely to falsely label black defendants as likely future criminals
More likely to mislabel whites as low risk

https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

You know racist computers are a bad idea

Don’t let your company invent racist computers

@estola