9/17/16 - When Recommendation Systems Go Bad - RecSys
TRANSCRIPT
Data Science impacts lives
Ads you see
Apps you download
Friend’s Activity/Facebook feed
News you’re exposed to
If a product is available
If you can get a ride
Price you pay for things
Admittance into college
Job openings you find/get
If you can get a loan
Recommendation Systems: Collaborative Filtering
Completely Normal Book Recommendations
For Asimov’s Foundation:
Foundation and Empire
Second Foundation
Prelude to Foundation
Forward the Foundation
Foundation’s Edge
Foundation and Earth
https://en.wikipedia.org/wiki/File:Isaac_Asimov_on_Throne.png
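Results like these come from collaborative filtering: recommend items that co-occur with what a user already likes. A minimal item-based sketch in Python (the toy data and helper names here are illustrative, not any production system):

```python
# Minimal item-based collaborative filtering sketch (illustrative toy data).
import numpy as np

# Rows = users, columns = items; 1 means the user liked/bought the item.
ratings = np.array([
    [1.0, 1.0, 1.0, 0.0],   # user 0
    [1.0, 1.0, 0.0, 1.0],   # user 1
    [0.0, 1.0, 1.0, 1.0],   # user 2
])

def item_similarity(r):
    """Cosine similarity between item columns."""
    norms = np.linalg.norm(r, axis=0, keepdims=True)
    unit = r / np.clip(norms, 1e-9, None)
    return unit.T @ unit

def recommend(r, user, k=2):
    """Score unseen items by similarity to the user's history."""
    scores = item_similarity(r) @ r[user]
    scores[r[user] > 0] = -np.inf           # don't re-recommend owned items
    ranked = np.argsort(scores)[::-1]
    return [i for i in ranked if np.isfinite(scores[i])][:k]

print(recommend(ratings, user=0))           # the items user 0 hasn't seen yet
```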
Completely Normal Search Engine Results
Query: Obama birth place
1. Honolulu, HI
2. Wikipedia: Obama birth place conspiracy theories
3. Birth Certificate: WhiteHouse.gov

Query: Obama birth certificate fake
1. 10 Facts that show Obama Birth Certificate is FAKE
2. OBAMA’S LAWYERS ADMIT TO FAKING BIRTH CERTIFICATE
3. Video: Proof Obama Birth Certificate is Fake
You just wanted a kitchen scale, now the internet thinks you’re a drug dealer
You purchased: Mini digital pocket kitchen scale!
You probably want:
100 pack subtle resealable baggies
250 perfectly legal ‘cigarette’ paper booklets
Totally reasonable number of small plastic bags
1000 ‘cigar’ wraps
Completely normal product results
https://commons.wikimedia.org/wiki/File:Cigarette_rolling_papers_%287%29.JPG
Ego
Member/customer/user first
Focus on building the best product, not on being the most clever data scientist
Much harder to spin a positive user story than a story about how smart you are
“Google searches involving black-sounding names are more likely to serve up ads suggestive of a criminal record”
“Black-sounding” names 25% more likely to be served ad suggesting criminal record
“NAME arrested?” Ads suggest queried name is associated with an arrest and warrants a background check
Ads for services related to recovering from arrest/incarceration
https://www.technologyreview.com/s/510646/racism-is-poisoning-online-ad-delivery-says-harvard-professor/
Ethics
We have accepted that Machine Learning can seem creepy; how do we prevent it from becoming immoral?
We have an ethical obligation to not teach machines to be prejudiced.
Data Ethics
Awareness
Talk about it!
Identify groups that could be negatively impacted by your work
Make a choice
Take a stand
Interpretable Models
For simple problems, simple solutions are often worth a small concession in performance
Inspectable models make it easier to debug problems in data collection, feature engineering, etc.
Only include features that work the way you want
Don’t include feature interactions that you don’t want
Logistic Regression
StraightDistanceFeature(-0.0311f),
ChapterZipScore(0.0250f),
RsvpCountFeature(0.0207f),
AgeUnmatchFeature(-1.5876f),
GenderUnmatchFeature(-3.0459f),
StateMatchFeature(0.4931f),
CountryMatchFeature(0.5735f),
FacebookFriendsFeature(1.9617f),
SecondDegreeFacebookFriendsFeature(0.1594f),
ApproxAgeUnmatchFeature(-0.2986f),
SensitiveUnmatchFeature(-0.1937f),
KeywordTopicScoreFeatureNoSuppressed(4.2432f),
TopicScoreBucketFeatureNoSuppressed(1.4469f, 0.257f, 10f),
TopicScoreBucketFeatureSuppressed(0.2595f, 0.099f, 10f),
ExtendedTopicsBucketFeatureNoSuppressed(1.6203f, 1.091f, 10f),
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.1702f, 0.252f, 0.641f),
ChapterRelatedTopicsBucketFeatureNoSuppressed(0.4983f, 0.641f, 10f),
DoneChapterTopicsFeatureNoSuppressed(3.3367f)
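This is the payoff of a linear model: every feature contributes one inspectable weight. A rough Python analogue using scikit-learn (the feature names and data are hypothetical, not Meetup's actual model):

```python
# Sketch: auditing a logistic regression by reading its weights.
# Feature names and training data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["distance", "rsvp_count", "topic_match", "age_unmatch"]
rng = np.random.default_rng(0)
X = rng.random((200, len(feature_names)))
y = (X[:, 2] > 0.5).astype(int)          # fake label driven by topic_match

model = LogisticRegression().fit(X, y)
for name, w in zip(feature_names, model.coef_[0]):
    print(f"{name:>12}: {w:+.4f}")       # sign and size show each feature's pull
```

If a weight has the wrong sign for a feature you care about, you can see it and act on it; that is much harder in a black-box ensemble.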
Feature Engineering and Interactions
● Good Feature:
○ Join! You’re interested in Tech x Meetup is about Tech
● Good Feature:
○ Don’t join! Group is intended only for Women x You are a Man
● Bad Feature:
○ Don’t join! Group is mostly Men x You are a Woman
● Horrible Feature:
○ Don’t join! Meetup is about Tech x You are a Woman
Meetup is not interested in propagating gender stereotypes
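Feature crosses are constructed by hand, which means the harmful ones can simply be left unbuilt. A hypothetical sketch of the four examples above:

```python
# Sketch: constructing explicit feature crosses and declining the bad ones.
# All names here are hypothetical illustrations of the examples above.
def make_features(user, meetup):
    return {
        # Good: user intent x meetup content
        "likes_tech_x_tech_meetup":
            user["likes_tech"] and meetup["is_tech"],
        # Good: respects the group's explicit membership rule
        "women_only_group_x_user_is_man":
            meetup["women_only"] and user["gender"] == "M",
        # Deliberately NOT constructed:
        #   mostly_men_group_x_user_is_woman  (bad: turns imbalance into a rule)
        #   tech_meetup_x_user_is_woman       (horrible: encodes a stereotype)
    }

print(make_features({"likes_tech": True, "gender": "F"},
                    {"is_tech": True, "women_only": False}))
```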
Ensemble Models and Data Segregation
Ensemble Models: Combine outputs of several classifiers for increased accuracy
If you have features that are useful but you’re worried about their interactions (and your model learns interactions automatically), use ensemble modeling to restrict the features to separate models.
Ensemble Model, Data Segregation
Model 1 (Data: Interests, Searches, Friends, Location) → Model 1 Prediction
Model 2 (Data: Gender, Friends, Location) → Model 2 Prediction
Final model (Data: Model 1 Prediction, Model 2 Prediction) → Final Prediction
https://commons.wikimedia.org/wiki/File:Animation2.gif
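In code, the segregation is just two models that never see each other's raw features, plus a combiner trained only on their outputs. A hypothetical scikit-learn sketch (columns and data are made up; note that location-style features may appear in both models, as in the diagram):

```python
# Sketch: ensemble with segregated features, so the sensitive feature
# cannot interact with the behavioral ones. Data/columns are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((500, 4))    # cols: interests, searches, location, gender
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

behavior_cols = [0, 1, 2]   # Model 1: interests, searches, location
sensitive_cols = [2, 3]     # Model 2: location, gender (location shared)
m1 = LogisticRegression().fit(X[:, behavior_cols], y)
m2 = LogisticRegression().fit(X[:, sensitive_cols], y)

# The final model sees only the two predictions, never the raw features.
stacked = np.column_stack([
    m1.predict_proba(X[:, behavior_cols])[:, 1],
    m2.predict_proba(X[:, sensitive_cols])[:, 1],
])
final = LogisticRegression().fit(stacked, y)
print(final.predict_proba(stacked[:3])[:, 1])   # final predictions
```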
“Women less likely to be shown ads for high-paid jobs on Google, study shows”
Carnegie Mellon ‘AdFisher’ project
Fake profiles, track ads
Career coaching for “200k+” executive jobs ad
Male group: 1852 impressions
Female group: 318 impressions
https://www.theguardian.com/technology/2015/jul/08/women-less-likely-ads-high-paid-jobs-google-study
Diversity Controlled Testing
Same technique can work to find bias in your own models! (sketch below)
Generate test data: randomize sensitive feature in real data set
Run model
Evaluate for unacceptable biased treatment
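A minimal version of that loop, assuming a scikit-learn-style `model` and feature matrix `X` (both hypothetical stand-ins here):

```python
# Sketch: randomize only the sensitive feature and measure how much
# the model's predictions move. 'model' and the column index are stand-ins.
import numpy as np

def bias_probe(model, X, sensitive_col, n_trials=10, seed=0):
    rng = np.random.default_rng(seed)
    base = model.predict_proba(X)[:, 1]
    gaps = []
    for _ in range(n_trials):
        X_rand = X.copy()
        rng.shuffle(X_rand[:, sensitive_col])   # randomize sensitive feature
        gaps.append(np.abs(model.predict_proba(X_rand)[:, 1] - base).mean())
    return float(np.mean(gaps))   # large gap => output tracks the feature

# Usage, e.g. against the segregated ensemble sketch above:
# print(bias_probe(final_pipeline, X, sensitive_col=3))
```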
What about automating this?
FairTest algorithm (Florian Tramèr)
Still needs you to decide what features are bad
Humanity required
“‘Holy F**K’: When Facial Recognition Algorithms Go Wrong”
Google Photos service
Automatic image tagging
Tagged African American couple as “gorillas”
http://www.fastcompany.com/3048093/fast-feed/holy-fk-when-facial-recognition-algorithms-go-wrong
● Twitter bot
● “Garbage in, garbage out”
● Responsibility?
“In the span of 15 hours Tay referred to feminism as a "cult" and a "cancer," as well as noting "gender equality = feminism" and "i love feminism now." Tweeting "Bruce Jenner" at the bot got similar mixed response, ranging from "caitlyn jenner is a hero & is a stunning, beautiful woman!" to the transphobic "caitlyn jenner isn't a real woman yet she won woman of the year?"”
Tay.ai
“Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day”
http://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
Diverse test data
Outliers can matter
The real world is messy
Some people will mess with you
Some people look/act different than you
Defense
Diversity
Design
“There’s software used across the country to predict future criminals. And it’s biased against blacks.”
An algorithm that predicts repeat offenders is used to help decide how harsh the sentence for a crime should be
Proprietary model: undisclosed algorithm, features, etc.
Claims to not use race as a factor
Nearly twice as likely to falsely label black defendants as likely future criminals
More likely to mislabel whites as low risk
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
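The disparity ProPublica measured is a gap in group-wise error rates, which is straightforward to compute for any model you can run. A hypothetical sketch:

```python
# Sketch: ProPublica-style audit of false positive / false negative
# rates per group. The arrays are hypothetical stand-ins for real data.
import numpy as np

def error_rates(y_true, y_pred, group):
    """(FPR, FNR) for each group value."""
    out = {}
    for g in np.unique(group):
        t, p = y_true[group == g], y_pred[group == g]
        fpr = ((p == 1) & (t == 0)).sum() / max((t == 0).sum(), 1)
        fnr = ((p == 0) & (t == 1)).sum() / max((t == 1).sum(), 1)
        out[g] = (round(float(fpr), 3), round(float(fnr), 3))
    return out

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])   # actually reoffended?
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])   # model said "high risk"?
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
print(error_rates(y_true, y_pred, group))     # unequal FPRs = the finding
```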