algorithmic transparency with quantitative input influenceece734/lectures/lecture-2018... ·...
TRANSCRIPT
AlgorithmicTransparencywithQuantitativeInputInfluence
AnupamDattaCarnegieMellonUniversity
18734:FoundationsofPrivacy
MachineLearningSystemsareOpaque
CreditClassifier
Userdata Decisions
?
2
MachineLearningSystemsareOpaque
CreditClassifier
Userdata Decisions
3
AlgorithmicTransparency|DecisionswithExplanations[Datta,Sen,Zick IEEESymposiumonSecurityandPrivacy2016]
4
Howmuchinfluencedovariousinputs(features)haveonagivenclassifier’sdecisionaboutindividualsorgroups?
NegativeFactors:OccupationEducationLevel
PositiveFactors:CapitalGain
Age 27
Workclass Private
Education Preschool
MaritalStatus Married
Occupation Farming-Fishing
Relationshiptohouseholdincome OtherRelative
Race White
Gender Male
Capitalgain $41310
…..
Challenge|CorrelatedInputs
Example:Creditdecisions
Conclusion:Measuresofassociationnotinformative
Classifier(usesonlyincome)
Age
Decision
Income
5
Challenge |GeneralClassofTransparencyQueries
Whatinputshavethemostinfluenceoncreditdecisionsofwomen?Group
Disparity
Whichinputhadthemostinfluenceinmycreditdenial?Individual
Whatinputsinfluencemengettingmorepositiveoutcomesthanwomen?
Result| QuantitativeInputInfluence(QII)
7
Atechniqueformeasuringtheinfluenceofaninputofasystemonitsoutputs.
CausalIntervention Dealswithcorrelatedinputs
QuantityofInterest Supportsageneralclassoftransparencyqueries
KeyIdea1|CausalIntervention
Classifier(usesonlyincome)
Age
Decision
Income
8
21284463
$90K$100K$20K$10K
Replacefeaturewithrandomvaluesfromthepopulation,andexaminedistributionoveroutcomes.
QIIforIndividualOutcomes
𝑥" 𝑥# … 𝑥$ …
𝑐
𝑋 𝑥" 𝑥# … 𝑥$ …
𝑐
𝑋'$𝑈$
𝐏𝐫 𝑐 𝑋 = 1 𝑋 = 𝒙/01] 𝐏𝐫 𝑐 𝑋'$𝑈$ = 1 𝑋 = 𝒙/01]
𝑢$
𝑈$
Inputs:𝑖 ∈ 𝑁
Classifier
Outcome
9
CausalIntervention:Replacefeaturewithrandomvaluesfromthepopulation,andexaminedistributionoveroutcomes.
KeyIdea 2|QuantityofInterest
•Variousstatisticsofasystem:• Classificationoutcomeofanindividual
𝐏𝐫 𝑐 𝑋 = 𝑐(𝒙8) 𝑋 = 𝒙8]−𝐏𝐫 𝑐 𝑋'$𝑈$ = 𝑐(𝒙8) 𝑋 = 𝒙8]
• Classificationoutcomesofagroupofindividuals𝐏𝐫 𝑐 𝑋 = 1 𝑋isfemale]
−𝐏𝐫 𝑐 𝑋'$𝑈$ = 1 𝑋isfemale]
• Disparitybetweenclassificationoutcomesofgroups𝐏𝐫 𝑐 𝑋 = 1 𝑋ismale] − 𝐏𝐫 𝑐 𝑋 = 1 𝑋isfemale]
− 𝐏𝐫 𝑐 𝑋'$𝑈$ = 1 𝑋ismale]− 𝐏𝐫 𝑐 𝑋'$𝑈$ = 1 𝑋isfemale]
10
𝜄𝒬𝒜 𝑖 = 𝒬𝒜 𝑋 − 𝒬𝒜(𝑋'$𝑈$)
QII |Definition
TheQuantitativeInputInfluence(QII)ofaninput𝑖 onaquantityofinterest𝒬𝒜(𝑋) ofasystem𝒜 isthedifferenceinthequantityofinterestwhentheinputisreplacedwithrandomvalueviaanintervention.
11
Result| QuantitativeInputInfluence(QII)
12
Atechniqueformeasuringtheinfluenceofaninputofasystemonitsoutputs.
CausalIntervention Dealswithcorrelatedinputs
QuantityofInterest Supportsageneralclassoftransparencyqueries
Challenge|SingleInputshaveLowInfluence
Classifier
Age
Decision
Income
Onlyacceptold,high-incomeindividuals
13
Young
Low
NaïveApproach| SetQII
Replace𝑋E withindependentrandomvaluefromthejointdistributionofinputs𝑆 ⊆ 𝑁.𝜄 𝑆 = 𝒬 𝑋 − 𝒬(𝑋'E𝑈E)
𝑥" 𝑥# … 𝑥$ …
𝑐
𝑥" 𝑥# … 𝑥$ …
𝑐
𝑢$𝑢"
𝑆
14
MarginalQII
15
• Notallfeaturesareequallyimportantwithinaset.
•MarginalQII:Influenceofageandincomeoveronlyincome.𝜄 {age, income} − 𝜄 {income}
• Butage isapartofmanysets!
𝜄 {age} − 𝜄 {} 𝜄 {age,gender} − 𝜄 {gender}
𝜄 {age, gender, income} − 𝜄 {gender, income}𝜄 {age, job} − 𝜄 {job} 𝜄 {age, gender, job} − 𝜄 {gender, job}
𝜄 {age,gender, income, job} − 𝜄 {gender, income, job}
𝜄 {age, gender, job} − 𝜄 {gender, job}
NeedtoaggregateMarginalQIIacrossallsets
KeyIdea 3|SetQIIisaCooperativeGame
• Cooperativegame< 𝑁,𝑣 𝑆 >• 𝑁:setofagents• 𝑣(𝑆):valueofset𝑆
• Exampleofcooperativegame• Voting:𝑣(𝑆) =doesmotionpassifvotersin𝑆 voteyes?• Marginalcontributionofvoter𝑖:𝑚$ 𝑆 = 𝑣 𝑆 ∪ 𝑖 − 𝑣(𝑆)
• Oursetting• Inputfeaturesareagents• Influenceoffeatureset𝑆 ,i.e.SetQII𝜄 𝑆 is𝑣 𝑆• MarginalQIIis𝑚$ 𝑆 = 𝑣 𝑆 ∪ 𝑖 − 𝑣(𝑆)
16
ShapleyValue|AggregatingMarginalContributions
• [Shapley’53]TheShapleyValue
•Onlymeasurethatsatisfiesasetofaxioms• Theseaxiomsarereasonableinoursetting.
17
MarginalQIIoffeaturei wrt setS
Aggregateoverallsets
𝑚$ 𝑆 = 𝑣 𝑆 ∪ 𝑖 − 𝑣(𝑆)
ShapleyAxioms
• (Symmetry) Equalmarginalcontributionimpliesequalinfluence• Example:clonedfeatures
• (Dummy) Zeromarginalcontributionimplieszeroinfluence• Example:featuresnevertouchedbyMLmodel
• (Monotonicity)Consistentlyhighermarginalcontributionacrossgamesyieldshigherinfluence•Necessarytocomparefeatureinfluencescoresofindividuals
18
ShapleyAxioms|Math
• (Symmetry)Forall𝑖, 𝑗 and𝑆 ⊆ 𝑁\{𝑖, 𝑗},𝑣 𝑆 ∪ {𝑖} = 𝑣(𝑆 ∪ {𝑗}),implies𝜙$ 𝑣 =𝜙[ 𝑣
• (Dummy)Forall𝑖, 𝑆 ⊆ 𝑁,𝑣 𝑆 ∪ 𝑖 = 𝑣 𝑆 ,implies𝜙$ 𝑣 = 0
• (Monotonicity)Fortwogames𝑣",𝑣#,ifforall𝑆,𝑚$ 𝑆, 𝑣" ≥ 𝑚$(𝑆, 𝑣#),then𝜙$ 𝑣" ≥𝜙$ 𝑣# .
19
Experiments |TestApplications
Predictivepolicing using theNationalLongitudinal SurveyofYouth(NLSY)
• Features:Age,Gender,Race, Location,SmokingHistory,DrugHistory
• Classification:HistoryofArrests• ~8,000individuals
arrests
income
Incomeprediction usingabenchmarkcensusdataset• Features:Age,Gender,Relationship,Education,CapitalGains,Ethnicity
• Classification: Income>=50K• ~30,000individuals
20
ImplementedwithLogisticRegression,KernelSVM,DecisionTrees,DecisionForest
PersonalizedExplanation |Mr X
21
Age 23
Workclass Private
Education 11th
MaritalStatus Never married
Occupation Craftrepair
Relationshipto household income Child
Race Asian-PacIsland
Gender Male
Capitalgain $14344
Capitalloss $0
Workhoursperweek 40
Country Vietnam
Age 23
Workclass Private
Education 11th
MaritalStatus Never married
Occupation Craftrepair
Relationshipto household income Child
Race Asian-PacIsland
Gender Male
Capitalgain $14344
Capitalloss $0
Workhoursperweek 40
Country Vietnam
income
Canassuageconcernsofdiscrimination.
PersonalizedExplanation |Mr Y
Age 27
Workclass Private
Education Preschool
MaritalStatus Married
Occupation Farming-Fishing
Relationshipto household income OtherRelative
Race White
Gender Male
Capitalgain $41310
Capitalloss $0
Workhoursperweek 24
Country Mexico
income
22
Age 27
Workclass Private
Education Preschool
MaritalStatus Married
Occupation Farming-Fishing
Relationshipto household income OtherRelative
Race White
Gender Male
Capitalgain $41310
Capitalloss $0
Workhoursperweek 24
Country MexicoExplanationsofsuperficiallysimilarpeoplecanbe
different.
Experiments |Adultdataset
23
Experiments |Arrestdataset
24
RelatedWork |InfluenceMeasures
• RandomizedCausalIntervention• FeatureSelection:PermutationImportance[Breiman 2001]• [Kononenko andStrumbelj 2010]• ImportanceofCausalRelations[Janzing etal.2013]• Canbeviewedasinstancesof ourtransparencyschema• Donotconsidermarginalinfluenceorgeneralquantitiesofinterest
• AssociativeMeasures• QuantitativeInformationFlow:Appropriateforsecrecy• FairTest [Tramèr etal.2015]• IndirectInfluence[Adleretal.2016]• Measurepotentialindirectuse• Correlatedinputshidecausality
25
RelatedWork |InterpretableModels
• Interpretability-by-Design• Regularizationforsimplicity(Lasso)• BayesianRuleLists,FallingRuleLists[Letham etal.2015,WangandRudin 2015]• Possibleaccuracyloss,butshowspromisingperformancecharacteristics• Individualexplanationscanstillbeuseful
• Interpretableapproximatemodels• [Baehrens etal.2010]• LIME[Ribeiroetal.2016]• Canprovidericherexplanantions• Causalrelationtooriginalmodelcanbeunclear
26
Result| QuantitativeInputInfluence(QII)
27
Atechniqueformeasuringtheinfluenceofaninputofasystemonitsoutputs.
CooperativeGame Computesjointandaggregatemarginalinfluence
CausalIntervention Dealswithcorrelatedinputs
QuantityofInterest Supportsageneralclassoftransparencyqueries
Performance QIImeasurescanbeapproximatedefficiently