© 2014 Experian Information Solutions, Inc. All rights reserved. Experian and the marks used herein are service marks or registered trademarks of Experian Information Solutions, Inc.
Other product and company names mentioned herein are the trademarks of their respective owners. No part of this copyrighted work may be reproduced, modified, or distributed in
any form or manner without the prior written permission of Experian. Experian Public.
Testing credit scores for disparate impact on protected classes
Sarah Davies VantageScore®
Geoff Gunn Experian
#vision2014
What is bias?
Q: Definition of bias:
A tendency to believe that some people, ideas, etc., are better than others that usually results in treating some people unfairly
Source: Merriam-Webster Dictionary
[Chart: "Best rating" percentage (axis 0% to 25%) for White, African American and Hispanic employees]
Case study: Employee ratings
CFPB employee ratings
► Source: American Banker, March 6, 2014
Picked up by multiple other sources, including the Wall Street Journal
An internal review has been ordered
Actions taken remain to be seen
Case study: Employee ratings
Testing VantageScore® for bias
Sarah Davies
VantageScore® Solutions, LLC
The Equal Credit Opportunity Act (ECOA), implemented by the Federal Reserve Board’s Regulation B (12 CFR 202), prohibits discrimination in credit transactions on the basis of specific protected classes. Protected classes are:
► Race or ethnicity
► Religion
► National origin
► Sex
► Marital status
► Age (provided the applicant has the capacity to contract)
► The applicant’s receipt of income derived from any public assistance program
► The applicant’s exercise, in good faith, of any right under the Consumer Credit Protection Act
Equal Credit Opportunity Act, disparate impact and measurable bias
Disparate impact
“A disparate impact occurs when a lender applies a racially (or otherwise) neutral policy or practice equally to all credit applicants but the policy or practice disproportionately excludes or burdens certain persons on a prohibited basis.”
Equal Credit Opportunity Act: Disparate impact
Measurable bias
Does the VantageScore® 3.0 credit scoring model exhibit any statistical bias in relation to any of the protected classes which, if the model is used by a lender, may lead to credit decisions that result in “disparate impact” outcomes?
Equal Credit Opportunity Act: Measurable bias
Testing a credit score to determine whether it exhibits measurable bias
► Metric
► Data design
► Statistical test
Case study – unsecured lending
Case study – secured lending
Possible hidden bias?
Today…
Credit scoring models, such as VantageScore® 3.0, are mathematical formulations built solely on consumer credit file information
► Payment history
► Age and types of credit
► Levels of utilization
► Credit limits
► Available credit
► Recent credit
No potentially discriminatory data such as ethnicity, employment, marital status, etc., are used
The credit score is a measure of risk defined as the probability that a consumer will default on a loan
► Default is defined as a loan becoming 90 or more days past due
Credit score model design
The credit score reflects statistical bias for a sub-population, if:
► The probability of default (PD) at a given score for the sub-population differs from the PD at the same score for all other sub-populations
► Examples:
● If the probability of default at a score of 700 is 5% for the Hispanic population while the probability of default at the same score is 4% for all other sub-populations, then the score reflects bias in favor of the Hispanic population
● If the PD at 700 is 3% for the Hispanic population while it is 4% for all other sub-populations, then the score is biased against the Hispanic population
Methodology for evaluating measurable bias
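The comparison described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: the record layout `(score, group, defaulted)`, the 25-point band width, the band floor of 500 and the sample records are all assumptions made for the example.

```python
from collections import defaultdict

def pd_by_band(records, band_width=25, floor=500):
    """Observed probability of default per (score band, sub-population).

    records: iterable of (score, group, defaulted) tuples, where
    defaulted is True if the loan went 90 or more days past due.
    Band width and floor are illustrative assumptions.
    """
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for score, group, defaulted in records:
        # Map the score to the lower edge of its band, e.g. 702 -> 700.
        band = floor + ((score - floor) // band_width) * band_width
        totals[(band, group)] += 1
        defaults[(band, group)] += int(defaulted)
    return {key: defaults[key] / totals[key] for key in totals}

# Hypothetical records for two sub-populations in the same score band.
records = [
    (702, "A", True), (710, "A", False), (715, "A", False), (720, "A", False),
    (701, "B", False), (705, "B", False), (718, "B", False), (724, "B", True),
]
rates = pd_by_band(records)
gap = rates[(700, "A")] - rates[(700, "B")]
print(f"PD gap between A and B in the 700 band: {gap:.3f}")
```

A gap near zero in every band corresponds to the unbiased case; whether a non-zero gap is meaningful is a question for the statistical test, not for this tabulation.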
Score represents the same level of risk for all sub-populations
Methodology – unbiased score
[Chart: probability of default (90 days or more past due), axis 0 to 0.7, against score, 500 to 850, for Sub-population 1, Sub-population 2 and Sub-population 3]
PD rates at the same score, 575, vary between 18% and 28%, reflecting bias
Methodology – biased score
[Chart: probability of default (90 days or more past due), axis 0 to 0.7, against score, 500 to 850, for Sub-population 1, Sub-population 2 and Sub-population 3]
U.S. Census Bureau: American Community Survey can be used to provide “ethnicity weights” by ZIP Code™
► AOMC – proportion of African American households in ZIP Code™
► AOHC – proportion of Hispanic American households in ZIP Code™
► Non-AOMC/Non-AOHC – proportion neither African American nor Hispanic American in ZIP Code™
Methodology – test data design for ethnicity
Example:
► ZIP Code™ with 30% African American, 20% Hispanic American and 50% non-African/Hispanic American
► Assign every consumer in the ZIP Code™ the following weights:
● 30% AOMC, 20% AOHC and 50% non-AOMC/non-AOHC
Methodology – test data design for ethnicity
[Chart: AOMC 30%, AOHC 20%, Non-AOMC/Non-AOHC 50%]
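One way to use such weights is to let each consumer contribute fractionally to every sub-population. The sketch below assumes that reading; the slides do not spell out how the weights enter the default-rate calculation. The ZIP codes, the `ZIP_WEIGHTS` table and the consumer records are hypothetical, with the first entry mirroring the 30%/20%/50% example above.

```python
# Hypothetical ZIP-level ethnicity weights (placeholder ZIP codes).
ZIP_WEIGHTS = {
    "00001": {"AOMC": 0.30, "AOHC": 0.20, "NON": 0.50},
    "00002": {"AOMC": 0.10, "AOHC": 0.60, "NON": 0.30},
}

def weighted_counts(consumers, zip_weights):
    """Accumulate weighted totals and weighted defaults per sub-population.

    consumers: iterable of (zip_code, defaulted) pairs.
    Returns {group: (weighted_total, weighted_defaults)}.
    """
    out = {}
    for zip_code, defaulted in consumers:
        for group, weight in zip_weights[zip_code].items():
            total, bad = out.get(group, (0.0, 0.0))
            # Every consumer adds its ZIP weight to each group's total,
            # and to the group's defaults only if the consumer defaulted.
            out[group] = (total + weight, bad + (weight if defaulted else 0.0))
    return out

# Hypothetical consumers: (ZIP code on credit file, defaulted?).
consumers = [("00001", True), ("00001", False), ("00002", False), ("00002", True)]
counts = weighted_counts(consumers, ZIP_WEIGHTS)
for group, (total, bad) in counts.items():
    print(group, round(bad / total, 3))
```

Because the weights in each ZIP Code sum to one, the weighted totals across sub-populations add up to the number of consumers in the sample.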
A Chi-Square test for multiple probabilities provides an empirical method for determining differences between sub-population default probabilities
For each score band, the default proportions for each sub-population are compared against the whole population default proportion
► Statistically significant differences in proportions between a sub-population and the whole represent bias for the score band
► If there is bias in any single score band, there is bias in the score as a whole
Apply confidence intervals to account for sample size differences
Methodology – statistical test
Chi-Square Test
VantageScore® 3.0 interval start: 701
VantageScore® 3.0 interval end: 725
Test Chi-Square: 9.682
Critical value: 11.408
Is test > critical value (if yes, then bias)? No
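For a single score band, the test statistic can be sketched as a standard Chi-Square test of equal proportions, with each sub-population's defaults compared against what the whole-population default rate would predict. This is an assumed formulation; the slides do not give the exact formula, degrees of freedom or significance level. The counts below are hypothetical, and the critical value is simply the one shown in the example above.

```python
def chi_square_proportions(groups):
    """Chi-Square statistic for equality of default proportions.

    groups: {name: (n, defaults)} for one score band; weighted counts
    are accepted. Assumes the pooled default rate is strictly between
    0 and 1, otherwise the expected counts are zero.
    """
    n_all = sum(n for n, _ in groups.values())
    d_all = sum(d for _, d in groups.values())
    p_all = d_all / n_all  # whole-population default proportion
    stat = 0.0
    for n, d in groups.values():
        expected_bad = n * p_all
        expected_good = n * (1.0 - p_all)
        stat += (d - expected_bad) ** 2 / expected_bad
        stat += ((n - d) - expected_good) ** 2 / expected_good
    return stat

# Hypothetical counts for one band: (consumers, defaults).
band = {"AOMC": (1000, 60), "AOHC": (1000, 50), "NON": (8000, 400)}
critical_value = 11.408  # value taken from the slide's example
stat = chi_square_proportions(band)
print(f"test statistic {stat:.3f}, bias flagged: {stat > critical_value}")
```

Running the same function over every score band and flagging the score if any band exceeds the critical value follows the rule stated above that bias in a single band implies bias in the score as a whole.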
Case studies
Unsecured lending
A sample of 1 million consumers with bankcard trades on their credit files was randomly selected from the U.S. population
Ethnicity weighting was assigned based on the ZIP Code™ on credit file
► Sub-populations
● AOMC – African American
● AOHC – Hispanic
● Non-AOMC/Non-AOHC – neither African American nor Hispanic
Evaluate default rate to score alignment graphically and statistically
Bankcard – do VantageScore® 3.0 scores exhibit bias toward certain ethnic populations?
[Chart: probability of default (90 days or more past due), axis 0 to 0.6, by VantageScore 3.0 range, 500 to 839; series: Non-AOMC/AOHC, AOMC, AOHC and Overall, with lower and upper bounds for AOMC and AOHC]
All default curves appear to be well within upper and lower acceptable thresholds
Case study: Bankcard
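The slides do not say how the upper and lower thresholds were computed. As one plausible reading, the sketch below places a normal-approximation interval around a reference default rate, sized by the sub-population's sample count, so that smaller groups get wider bounds. The function names, the 1.96 multiplier and the example figures are assumptions.

```python
import math

def rate_bounds(p_ref, n, z=1.96):
    """Normal-approximation bounds around a reference default rate.

    p_ref: reference (e.g. overall) default rate in the score band.
    n: number of consumers (or weighted count) in the sub-population.
    z: multiplier for the interval width; 1.96 is an assumed choice.
    """
    half_width = z * math.sqrt(p_ref * (1.0 - p_ref) / n)
    return max(0.0, p_ref - half_width), min(1.0, p_ref + half_width)

def within_bounds(group_rate, p_ref, n, z=1.96):
    """True if the sub-population's default rate falls inside the bounds."""
    lower, upper = rate_bounds(p_ref, n, z)
    return lower <= group_rate <= upper

# Hypothetical band: overall default rate 20%, 400 consumers in the group.
lower, upper = rate_bounds(0.20, 400)
print(f"bounds: {lower:.4f} to {upper:.4f}")
print(within_bounds(0.22, 0.20, 400), within_bounds(0.28, 0.20, 400))
```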
[Chart: 90+ days past due rate, axis 0.15 to 0.6, for scores 500 to 575; series: Non-AOMC/AOHC, AOMC, AOHC and Overall, with lower and upper bounds for AOMC and AOHC]
Case study: Bankcard
Hispanic default rates are slightly lower than those of other populations; however, all ethnic groups are well within the confidence intervals
Scores reflect no measurable bias for different ethnic groups
No test statistic exceeds the critical value
VantageScore® 3.0 reflects no measurable bias toward protected sub-populations
Case study: Bankcard
[Chart: test Chi-Square statistic and critical value by score band, 500 to 850; axis 0 to 12]
Case studies
Secured lending
Added complexity…
► Underwriting driven by multiple criteria, principally home value and income
► Home values were severely stressed during the recession
► The credit score's role in mortgage origination decisions prior to 2009 was overwhelmed by other factors that materially contributed to default rates
Evaluation dataset design
► Exclude originations made prior to 2009
► Incorporate a price-to-income (PTI) filter to capture “ability to repay” capacity
● Append U.S. Census American Community Survey data, median homeowner household income by ZIP Code™
Mortgage – do VantageScore® 3.0 scores exhibit bias toward certain ethnic populations?
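The dataset filters above can be sketched as a simple predicate. The sketch assumes PTI is the home price divided by the ZIP-level median homeowner household income and uses a threshold of three as the cut-off for a "sound" ratio; the source of the home price, the function name and the parameter names are illustrative assumptions.

```python
def passes_mortgage_filters(origination_year, home_price, zip_median_income,
                            max_pti=3.0, first_year=2009):
    """Keep a mortgage only if it meets the evaluation dataset design.

    origination_year: year the mortgage was originated.
    home_price: price of the home (source of this value is assumed).
    zip_median_income: median homeowner household income for the ZIP Code.
    """
    if origination_year < first_year:
        return False  # exclude originations made prior to 2009
    if zip_median_income <= 0:
        return False  # PTI cannot be computed without a positive income
    pti = home_price / zip_median_income
    return pti <= max_pti

# Hypothetical mortgages: (year, home price, ZIP median income).
examples = [(2010, 240_000, 80_000), (2008, 240_000, 80_000), (2012, 400_000, 80_000)]
kept = [m for m in examples if passes_mortgage_filters(*m)]
print(len(kept), "of", len(examples), "mortgages kept")
```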
Sample of 860,000 consumers with mortgages originated from 2009 onwards and a ‘sound’ PTI of three or less
Ethnicity weighting was assigned based on the ZIP Code™ on credit file
► Sub-populations
● AOMC – African American
● AOHC – Hispanic
● Non-AOMC/Non-AOHC – neither African American nor Hispanic
Evaluate default rate to score alignment graphically and statistically
Mortgage – do VantageScore® 3.0 scores exhibit bias toward certain ethnic populations?
[Chart: probability of default (90 days or more past due), axis 0 to 0.6, by VantageScore 3.0 range, 500 to 839; series: Non-AOMC/AOHC, AOMC, AOHC and Overall, with lower and upper bounds for AOMC and AOHC]
Graphically, some separation is observed in default rate profiles
All profiles still appear within confidence intervals
Case study: Mortgage
[Chart: 90+ days past due rate, axis 0 to 0.6, for scores 500 to 575; series: Non-AOMC/AOHC, AOMC, AOHC and Overall, with lower and upper bounds for AOMC and AOHC]
While there is greater separation in profiles, both Hispanic and African American sub-population profiles remain within the confidence intervals
Case study: Mortgage
No test statistic exceeds the critical value
VantageScore® 3.0 reflects no measurable bias toward protected sub-populations
Case study: Mortgage
[Chart: test Chi-Square statistic and critical value by score band, 500 to 850; axis 0 to 12]
We’ve discussed a methodology and key considerations for measuring whether a score reflects bias
What if the underlying model design causes consumers of a particular sub-population to actually become unscoreable?
As a result of the recession, many consumers have reduced their credit usage in terms of number of open accounts and frequency of usage
► If the frequency of usage falls below the threshold level necessary to be scored by certain models, then the consumer becomes unscoreable
One more form of possible bias…
Why are some consumers unscoreable by conventional models?
Conventional model criteria
At least one trade with 6 months history (new to market)
At least one trade updated within a 6-month window (infrequent user)
No activity in the last 24 months (rare credit user)
At least one open trade
Unconventional models
30-35 million consumers are not scored by conventional models
Approximately 9 million of these consumers are African American or Hispanic
3 million of these consumers score above 600
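One reading of the conventional criteria listed on this slide is sketched below: a consumer is scoreable only with at least one trade carrying six months of history, at least one trade updated within the last six months, and at least one open trade. The `Trade` fields and the function name are hypothetical, and the 24-month inactivity case is not modelled separately because such a file already fails the six-month update check.

```python
from dataclasses import dataclass

@dataclass
class Trade:
    months_of_history: int    # age of the trade in months
    months_since_update: int  # months since the trade was last updated
    is_open: bool             # whether the trade is currently open

def conventionally_scoreable(trades):
    """Apply the assumed conventional-model minimum criteria to a credit file."""
    has_history = any(t.months_of_history >= 6 for t in trades)
    recently_updated = any(t.months_since_update <= 6 for t in trades)
    has_open_trade = any(t.is_open for t in trades)
    return has_history and recently_updated and has_open_trade

# Hypothetical credit files.
established = [Trade(months_of_history=12, months_since_update=1, is_open=True)]
new_to_market = [Trade(months_of_history=3, months_since_update=1, is_open=True)]
rare_user = [Trade(months_of_history=48, months_since_update=30, is_open=False)]
for name, trades in [("established", established),
                     ("new to market", new_to_market),
                     ("rare user", rare_user)]:
    print(name, conventionally_scoreable(trades))
```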
For higher concentration regions, 20%-23% of African American consumers are unscoreable by conventional models
African American consumers
Scored/no score consumers by ZIP Code™ band
ZIP Code™ band   Scored   No score
0-10%            89%      11%
10-20%           86%      14%
20-30%           85%      15%
30-40%           83%      17%
40-50%           82%      18%
50-60%           80%      20%
60-70%           78%      22%
70-80%           78%      22%
80-90%           77%      23%
90-100%          77%      23%
Newer credit score models can score these consumers, avoiding bias exposure
Bias testing methodologies can be effectively used to identify credit score model biases
Certainly, these methodologies have required some refinements given the confounding effects of more granular underwriting strategies and stressed asset values for secured lending products
Moreover, hidden biases may exist that additionally impact your business opportunity
If bias is uncovered, develop a plan to eliminate the bias
Wrap-up