
White Paper

AML Customer Risk Rating

Modernize customer risk rating models to meet risk governance regulatory expectations

Contributors

Edwin Rivera, Senior AML Analytics Consultant for Fraud and Compliance Solutions, SAS

Jim West, Senior AML Analytics Consultant for Fraud and Compliance Solutions, SAS

Carl Suplee, Senior Solutions Architect for Banking Security Intelligence Practice, SAS

Jason Grasso, Solutions Architect, Security Intelligence Practice, SAS

Contents

Executive Summary
Comparing Heuristic Rule-Based Models to Statistical Models
Heuristic Rule-Based Models
Statistical Models
Statistical Models: The Preferred Modeling Approach
Why Ordinal Logistic Regression for Customer Risk Rating?
Conclusion: A More Effective Method for Managing Customer Risk
Appendix A: Technical and Procedural Aspects of Customer Risk Rating Model Development Using Ordinal Logistic Regression
How to Develop a Customer Risk Rating Model Using Ordinal Logistic Regression
Develop an Accurate Target Variable
Evaluate Multicollinearity
Identify Zero-Count Cells
Test the Proportional Odds Assumption
Develop and Test the Model
Select Variables
Assess Output From the Model
Deploy the Model
Validate the Model on an Ongoing Basis
Contact Information


Executive Summary

Assessing customer risk is an essential component of a comprehensive Bank Secrecy Act/Anti-Money Laundering (BSA/AML) monitoring program. As the FFIEC BSA/AML Examination Manual clearly explains, customer due diligence begins with verifying each customer’s identity and assessing the associated risk. Firms must then establish processes to provide the additional scrutiny necessary for higher-risk customers.

In light of the Supervisory Guidance on Model Risk Management (OCC 2011-12/FED SR 11-7), financial institutions are re-evaluating their customer risk rating models. More financial institutions are moving from heuristic, rule-based customer risk rating models to statistical models, specifically ordinal logistic regression models. These statistical models perform better than rule-based models, are easier to justify to regulators, and are easier to update, validate and maintain because they use an established and understood framework.

As firms look to improve their customer risk rating models, or implement models where they currently don’t exist, they often ask:

• What are the pros and cons of using heuristic rules versus statistical models?

• Why the regulatory push toward using statistical models?

• What type of statistical models should the firm implement?

• What attributes should the firm consider when developing the model?

Comparing Heuristic Rule-Based Models to Statistical Models

Traditionally, customer risk rating models have focused on rating customer risk in several distinct areas, often using multiple variables within a single area. Some of the variables used in a customer risk rating model include:

• Customer relationship – personal, business, commercial, etc.

• Geography – country of residence, business location, high-intensity financial crime areas (HIFCA), high-intensity drug trafficking areas (HIDTA), port or border cities, etc.

• Account features – remote deposit capture, correspondent banking, online banking, custodial accounts, etc.

• High-risk customer – nonresident alien (NRA), politically exposed persons (PEP), money service businesses (MSB), employees, etc.

• Alert/filing history – manual alerts created, system generated alerts, cash transaction reports (CTRs), suspicious activity reports (SARs), etc.

• Expected product usage – wires (domestic or foreign), cash, automated clearing house (ACH), check, etc.

• Expected transactional activity (i.e., aggregate dollar amount of activity expected).

Both heuristic rule-based and statistical models consider the same basic customer data attributes. However, the underlying methodology and the way each model weights the variables to score the customers differs – often significantly.



Heuristic Rule-Based Models

A heuristic, rule-based model is simply an analytic formula used to assign a score based on one or more variables or attributes that are important to the firm. Because the relative importance of each individual variable is generally unknown, variable selection is difficult. As a result, these models are often created using all available variables. As shown in Figure 1, these models are often parameterized to allow the user to adjust the scores and weights assigned to each component of the model.

Each customer’s scores are then aggregated, and the customer is assigned a risk category based on the aggregate score. Generally these models are based on “subject-matter expert judgment or knowledge,” rather than formal analysis. Because they follow no underlying methodology, there is an endless array of possible model designs and scoring schemes, which makes validating the model that much more difficult.

The lack of an underlying statistical framework is a weakness of these models. It means there is no established statistical methodology for setting parameters or selecting variables to include in the model. Even after the general modeling framework has been set, these models require numerous iterations to determine which parameter settings maximize the model’s fit to the target variable. Additionally, there is no effective way to determine the optimal parameters. This makes it increasingly difficult to defend these models to regulators, especially in light of the Supervisory Guidance on Model Risk Management.

Heuristic, rule-based models were once the norm within the AML community due to their simplicity and ease of development. Today, however, they are being replaced by more scientific modeling approaches that can be more successfully defended to regulators and that enable a methodical approach to parameter setting and model validation.

| Variable Type | Attributes | Logic Description |
| Customer Relationship | Customer Type | If customer is “Personal,” then the score is 35. If customer is “Commercial,” then the score is 20. |
| High-Risk Customer | Money Services Business (MSB) | If customer is “MSB,” then the score is 80. |
| High-Risk Customer | Politically Exposed Person (PEP) | If customer is “PEP,” then the score is 80. |
| Alert/Filing History | Suspicious Activity Reports (SARs) | If SAR count equals 1, then the score is 45. If SAR count is greater than 1, then the score is 60. |
| Expected Transaction Activity | Total Aggregated Transactions | If the monthly transaction volume is less than $50K, then the score is 40. If the monthly transaction volume is greater than or equal to $50K, then the score is 60. |

Figure 1. Example of a heuristic customer risk rating model.
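To make the mechanics concrete, here is a minimal sketch of how the Figure 1 rules might be coded in a SAS data step. The input table and variable names (Customers, customer_type, msb_flag, pep_flag, sar_count, monthly_txn_amt) are hypothetical illustrations, not part of the paper’s example.

data HeuristicScores;
   set Customers;                               /* hypothetical customer table */
   score = 0;
   /* customer relationship */
   if customer_type = 'Personal' then score = score + 35;
   else if customer_type = 'Commercial' then score = score + 20;
   /* high-risk customer flags */
   if msb_flag = 1 then score = score + 80;     /* money services business    */
   if pep_flag = 1 then score = score + 80;     /* politically exposed person */
   /* alert/filing history */
   if sar_count = 1 then score = score + 45;
   else if sar_count > 1 then score = score + 60;
   /* expected transaction activity */
   if monthly_txn_amt < 50000 then score = score + 40;
   else score = score + 60;
run;

Each customer’s aggregate score would then be banded into a risk category, as described above.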

Statistical Models

Statistical models are based on well-established statistical methodologies and approaches that have been vetted, reviewed and published in academic journals. Most statistical models used for customer risk rating are predictive models, such as linear regression, binary/ordinal logistic regression, decision trees (all types) or neural networks. The particular application and risk-rating objectives determine the actual statistical model a firm would select. However, binary or ordinal logistic regression models are currently the most commonly used for rating customer risk.

Unlike heuristic, rule-based models, statistical models require that certain assumptions be met so the modeling framework can be assessed with a stated degree of statistical confidence. A common goal when creating statistical models is to develop the simplest model (i.e., the one with the fewest variables) that still makes an equally accurate prediction.

A robust statistical framework based on well-established, widely accepted modeling approaches allows firms to select model variables and coefficients that maximize the likelihood of estimating the target accurately. In addition, firms can effectively use standard approaches for assessing each variable’s significance, the model’s overall goodness of fit, and its predictive power to defend the model to regulators. These factors also indicate whether the firm needs a different or more complex model.



Historically, statistical models have been less common in the AML community because they seem more complex to nonstatisticians. However, these models are quickly becoming standard due to the regulatory pressure to use more scientific approaches.

Statistical Models: The Preferred Modeling Approach

The advantages of statistical modeling frameworks over heuristic, rule-based models are compelling. Statistical models are superior for their ability to:

• Identify the most effective variables.

• Select coefficients (i.e., weights) based on maximum likelihood estimation.

• Assess model strength.

• Estimate the confidence of model predictions.

The fact that the regulators prefer and better understand these methodologies only supports the decision to use such an approach.

Why Ordinal Logistic Regression for Customer Risk Rating?

Once a firm decides to adopt a statistical model, it needs to find the right model for customer risk rating. There are several methods to choose from; all have slightly different objectives and advantages. However, ordinal logistic regression is very effective for developing a customer risk rating model.

Ordinal logistic regression differs from the binary model in that the target value can take on more than two ordered categories. While some logistic models support multiple categories for target variables that aren’t ordered (called multinomial), ordered categories are preferable for two reasons:


• Ordinal models are simpler than multinomial models, and therefore easier to interpret (Allison 2012).1

• The hypothesis tests for ordered models are more powerful than for multinomial models (Allison 2012).2

1 Allison, Paul D. (2012), Logistic Regression Using SAS: Theory and Application, Second Edition, Cary, NC: SAS.

2 Ibid.

Appendix A describes how to develop a customer risk rating model using ordinal logistic regression.


Conclusion: A More Effective Method for Managing Customer Risk

In our experience at SAS working with clients, we’ve found that statistical models are the most effective way to classify customer risk. In particular, clients are modernizing their customer risk rating programs by moving from heuristic, rule-based models to statistical models, specifically ordinal logistic regression models. These statistical models perform better than rule-based models, are easier to justify to regulators, and are easier to update, validate and maintain.


Appendix A: Technical and Procedural Aspects of Customer Risk Rating Model Development Using Ordinal Logistic Regression

The most popular ordinal logistic regression model is the cumulative logit model, which is the default used by SAS/STAT® software. The cumulative logit model treats the ordinal target as a series of cumulative binary (dichotomous) splits and assumes a common set of slope coefficients across those splits. The firm must initially test this proportional odds assumption using the score test. In cases where the assumption is severely violated and cannot reasonably be believed to hold, a multinomial or binary model is used instead.

While the model produces only one set of beta coefficients, the equation contains one intercept fewer than the number of target categories. The probability that a customer falls within each category is then calculated, and the model assigns the customer to the category with the greatest probability (i.e., the model’s estimated category).
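In symbols, for a target with $k$ ordered categories and covariate vector $x$, the cumulative logit model fits

$$\log\frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)} = \alpha_j + \beta' x, \qquad j = 1, \dots, k-1,$$

with a single coefficient vector $\beta$ shared across the $k-1$ cumulative splits and one intercept $\alpha_j$ per split. The category probabilities follow by differencing: $P(Y = j \mid x) = P(Y \le j \mid x) - P(Y \le j-1 \mid x)$.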

How to Develop a Customer Risk Rating Model Using Ordinal Logistic Regression

The following sections describe the preliminary analysis required to develop a customer risk rating model using ordinal logistic regression. By understanding the process at a high level, a firm can overcome the mystery and perceived complexity that surround these models. While this paper assumes the use of the SAS/STAT product, other products, such as SAS® Enterprise Miner™, also offer ordinal logistic regression.

Develop an Accurate Target Variable

Before exploring the data going into the model, you must evaluate the target variable for accuracy. The target variable for customer risk should reflect historical experience. If your firm isn’t confident that its current model is accurately assigning risk to its customers, it should sample and review customers across different risk levels and attribute values, for example with a stratified random sample like the sketch below. When building and testing the ordinal logistic regression model, samples should be large enough to be statistically significant.
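A minimal sketch of drawing such a review sample with PROC SURVEYSELECT; the data set name, stratum variable and sample size are illustrative assumptions:

/* sort by the stratum variable, as PROC SURVEYSELECT requires */
proc sort data=Customers out=CustomersSorted;
   by risk_level;
run;

/* draw 100 customers from each current risk level for manual review */
proc surveyselect data=CustomersSorted out=ReviewSample
                  method=srs sampsize=100 seed=2015;
   strata risk_level;
run;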

Evaluate Multicollinearity

Multicollinearity occurs when two or more predictor variables (e.g., independent variables or covariates) in a regression model are highly correlated with each other. Specifically, it exists when one or more of the variables used in the model can be linearly predicted with a reasonable degree of accuracy from the other variables in the model. Note that we are referring only to the relationships among the predictive variables within the model; the predictive variables are expected to be correlated with the dependent or target variable.

When multicollinearity is present, the model’s estimated coefficients may change erratically in response to small changes in the data or model. While multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data used to train it, it does affect the estimates and significance calculations for individual predictors. A regression model with correlated predictor variables can indicate how well all variables jointly predict the target variable, but it may not give valid results about any one predictor or about which predictors are redundant.

Two approaches are commonly used to detect multicollinearity. In the first approach, the SAS/STAT CORR procedure is used to produce a correlation matrix of the predictive variables. It is important to select the Spearman correlation: the Spearman coefficient is nonparametric and rank-based, making it appropriate when data is ordinal or grouped rather than continuous. The following illustrates the CORR procedure using SAS/STAT:

proc corr data=YourData
          outs=CorrData(where=(upcase(_TYPE_)='CORR'))   /* keep only the correlation rows */
          nomiss spearman;                               /* rank-based (Spearman) correlations */
   var YourVariables;
run;

The second approach is to run a regression that includes all of the predictive variables and request the variance inflation factors (VIF). This produces a VIF value for each variable. The VIF is the reciprocal of one minus the coefficient of determination between the respective variable and the remaining predictor variables. Typically, a VIF greater than or equal to 4 indicates moderate multicollinearity, while a VIF greater than or equal to 10 signifies high multicollinearity. The following illustrates how SAS/STAT calls the REG procedure:

proc reg data=YourData;
   /* YourTarget is a placeholder for the model's target variable, which  */
   /* the MODEL statement requires; TOL and VIF report collinearity       */
   /* diagnostics for each predictor                                      */
   model YourTarget = YourVariables / tol vif;
run;
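Written as a formula, the VIF for predictor $i$ is

$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2},$$

where $R_i^2$ is the coefficient of determination from regressing predictor $i$ on the remaining predictors. For example, $R_i^2 = 0.75$ gives $\mathrm{VIF}_i = 4$, the moderate-multicollinearity threshold noted above.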


Identify Zero-Count Cells

Zero-count cells, or events for which there are no observations, can destabilize logistic regression results, and the maximum likelihood estimate does not exist for the affected variable. If this is not addressed, SAS/STAT warns of quasi-complete or complete separation when it encounters zero-count cells; however, the warning does not specify which variable the separation applies to.

As a result, it is important to identify zero-count cells before fitting the logistic regression model. To do so, your firm should generate cross-tabulation tables of individual predictor variables versus the target variable, for example with PROC FREQ as sketched below. In our examples, the firm uses five categories (low, medium, high-low, high-medium and high-high) to stratify high-risk customers.
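A minimal cross-tabulation sketch; the predictor and target variable names (CountryRisk, Risk) are illustrative assumptions:

proc freq data=YourData;
   /* one table per candidate predictor versus the ordinal target; */
   /* any zero-count cell flags a potential separation problem     */
   tables CountryRisk*Risk / norow nocol nopercent;
run;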

Table 1 is an example of quasi-complete separation where there are no low, medium, or high-low risk customers with a Country Risk of 3.

Table 2 is an example of complete separation where any customer with a Country Risk of 3 falls in the high-high customer risk category.

To handle situations of quasi-complete and complete separation, your firm can take the following actions:

• Remove the variables causing the problem (if they contribute only marginally to the model).

• Combine categories if the variable has multiple categories (see the sketch after this list).

• Define a rule outside the model that automatically assigns high risk to customers who meet criteria that always result in their being considered high risk.

• Check whether another variable is a dichotomous version of the variable in question.

• Gather more sample data that reflects what is missing, if possible.
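As one illustration of the category-combination option, a minimal data-step sketch; the variable name and the choice to fold category 3 into category 2 are assumptions for illustration:

data YourDataCombined;
   set YourData;
   /* fold the sparse category 3 into category 2 so that every */
   /* predictor cell has observations at each target level     */
   if CountryRisk = 3 then CountryRisk = 2;
run;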

| Variable | Data Type | Category | Low | Medium | High-Low | High-Medium | High-High |
| Country Risk | Ordered Category | 1 | 102,528 | 117 | 268 | 68 | 31 |
| Country Risk | Ordered Category | 2 | 19,337 | 181 | 57 | 28 | 28 |
| Country Risk | Ordered Category | 3 | 0 | 0 | 0 | 33 | 73 |

Table 1. Example of quasi-complete separation (counts by target risk level).

| Variable | Data Type | Category | Low | Medium | High-Low | High-Medium | High-High |
| Country Risk | Ordered Category | 1 | 102,528 | 117 | 268 | 68 | 31 |
| Country Risk | Ordered Category | 2 | 19,337 | 181 | 57 | 28 | 28 |
| Country Risk | Ordered Category | 3 | 0 | 0 | 0 | 0 | 106 |

Table 2. Example of complete separation (counts by target risk level).


Test the Proportional Odds Assumption

The proportional odds assumption holds when the coefficients are the same across the dichotomous groupings of the outcome variable. Often in ordinal logistic regression, the proportional odds assumption does not hold. This is widely understood but often ignored because, depending on the modeling objective, the practical implications can be minimal. In SAS/STAT, the “Score Test for the Proportional Odds Assumption” tests the hypothesis that the estimated coefficients are not materially different from each other, regardless of the dichotomization.

Table 3 shows the mapped value of the target for the logistic regression model, and Table 4 shows the dichotomous groups used in the series of binary logistic regressions (i.e., 1 versus 2, 3, 4 and 5).

| Target | Model Value |
| Low | 1 |
| Medium | 2 |
| High-Low | 3 |
| High-Medium | 4 |
| High-High | 5 |

Table 3. Example of target mapping.

| 0 | 1 |
| 1 | 2, 3, 4, 5 |
| 1, 2 | 3, 4, 5 |
| 1, 2, 3 | 4, 5 |
| 1, 2, 3, 4 | 5 |

Table 4. Example of dichotomous groups.

The null hypothesis says that there is no statistical difference in the estimated coefficients between models; the alternative hypothesis says that there is. If the p-value is high, we fail to reject the null hypothesis and can conclude that the estimates are not significantly different.

Output 1 shows an example of the score test results in SAS/STAT. Note that the score test rejects the null hypothesis more frequently than it should. In Categorical Data Analysis Using SAS, Stokes, Davis and Koch3 note that this test needs at least five observations for each outcome of the category versus the target: in a cross-tab of a categorical variable versus the target variable, you need five or more observations in each cell. If that is not the case, the sample size might be too small, there could simply be no data (a zero-count cell), or the event may be rare.

Score Test for the Proportional Odds Assumption
| Chi-Square | DF | Pr > ChiSq |
| 53.4698 | 24 | 0.0005 |

Output 1. Example of score test for the proportional odds assumption.

3 Stokes, M.E., Davis, C.S. and Koch, G.G. (2012), Categorical Data Analysis Using SAS, Third Edition, Cary, NC: SAS.

Develop and Test the Model

In ordinal logistic regression that considers many combinations of covariates, analysts use a holdout sample to test whether the model “truly” fits the data rather than fitting it simply by chance. Analysts commonly build the model on roughly 70 percent of the data and test it on the remaining 30 percent (the holdout data set). After testing, they can run the model on the whole population and compare the outcome estimate percentages for the entire population to those obtained during model development and initial testing.

SAS/STAT contains a procedure called SURVEYSELECT that analysts can use to create the build and test data sets. This procedure randomly splits the data, and the SAMPRATE option sets the percentage by which to split it. In the code below, SAMPRATE is set to 0.7 (70 percent). The OUTALL option keeps all records from the original data set and creates a new variable called SELECTED, which takes the value 1 if the record is part of the 70 percent build sample and 0 for the remaining 30 percent.

proc surveyselect data=YourData out=SplitData
                  samprate=0.7 outall;   /* flag a 70/30 build/test split */
run;
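If the split needs to be reproducible across validation runs, PROC SURVEYSELECT’s SEED= option fixes the random draw; the seed value below is an arbitrary illustration:

proc surveyselect data=YourData out=SplitData
                  samprate=0.7 outall
                  seed=20150701;   /* arbitrary fixed seed makes the split repeatable */
run;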

Select Variables

SAS offers several methods – forward selection, backward elimination and stepwise selection – to determine which variables to include in the ordinal logistic regression model. Your firm may also forgo these selection methods and instead use specific variables that you deem important with respect to the target, in this case customer risk.

The forward selection procedure used in SAS systematically evaluates each available attribute and adds to the model the one that most improves model performance. The procedure then works through the remaining attributes one by one to determine whether any others add significantly to model performance. Forward selection terminates when no further effects can be added to significantly improve model performance or when all of the attributes are included.

The backward elimination procedure used in SAS begins with all the attributes and notes the variable with the smallest partial F-statistic. It systematically evaluates each attribute and removes those with the most insignificant effect on model performance. Backward elimination terminates when the variable with the smallest partial F-statistic is significant.

The stepwise selection procedure used in SAS systematically evaluates the available attributes one at a time to determine whether their removal or addition adds significantly to model performance. Stepwise selection terminates when no further effects can be added to or removed from the model to significantly improve model performance, or when all attributes are included.

These variable selection procedures work well mathematically. However, your firm should also select variables that will satisfy regulatory expectations when developing customer risk rating models. For instance, if the model does not select variables regarding high-risk geography, be prepared to defend why these variables were not significant (e.g., all customers are located in high-risk areas). If your firm cannot determine a good reason for the variable’s exclusion, you should manually add the variable back into the model. The following SAS/STAT code builds an ordinal logistic regression model with forward selection on the build data and then uses the built model to score the test data:

/* build: fit the ordinal model on the 70 percent build sample */
proc logistic data=SplitData(where=(selected eq 1))
              plots(only)=(effect(polybar) oddsratio(range=clip))
              descending outmodel=yourModel;
   class yourOrdinalVariables / param=reference;
   model risk = yourVariables / selection=forward rsq;
   output out=yourBuildResults predprobs=individual;
run;

/* test: score the 30 percent holdout with the stored model */
proc logistic inmodel=yourModel;
   score data=SplitData(where=(selected eq 0)) fitstat
         out=yourTestResults;
run;

Assess Output From the Model

Ordinal logistic regression calculates the probability of each risk level for a customer and assigns the customer the risk level with the highest probability. Your firm can, in turn, use the risk assignments to assess the model’s predictive power using standard measures of association, which the LOGISTIC procedure can calculate. These predictive measures are derived from the concordant and discordant pairs observed within the data.

Output 2 shows example estimates of the predictive power of the model (note that the values range from 0 to 1, with larger values signifying greater predictive power). The pseudo-coefficient of determination, often written R2, is another popular statistic for assessing the predictive power of the logistic model; the max-rescaled R2 adjusts the statistic to account for the fact that in a discrete-outcome model the R2 value can never actually reach 1.

The LOGISTIC Procedure
Probabilities modeled are cumulated over the lower ordered values.

Association of Predicted Probabilities and Observed Responses
| Percent Concordant | 96.9 | Somers’ D | 0.947 |
| Percent Discordant | 2.2 | Gamma | 0.955 |
| Percent Tied | 0.9 | Tau-a | 0.708 |
| Pairs | 5705 | c | 0.973 |

Fit Statistics for SCORE Data
| R-Square | Max-Rescaled R-Square | AUC | Brier Score |
| 0.758697 | 0.800536 | . | 0.308647 |

Output 2. Example ordinal logistic regression results.

In general, tests of the model’s predictive power assess how well your firm can predict the target variable using the covariates (i.e., the predictive variables). It is possible to have a model that predicts the target variable very well but fails the goodness-of-fit tests; a model can also make poor predictions but show very good fit. Predictive power is commonly measured using Somers’ D, Gamma, Tau-a and c (or AUC), as listed in Output 2.

Another useful way to view model results is to generate a two-way contingency (cross-tabulation) table of the predicted target versus the actual target, as shown in Figure 2.

Model Error Severity (Combined Data)
| Estimated Target | Low | Medium | High-Low | High-Medium | High-High | Row Total |
| Low | 57 | 1 | 1 | | | 59 |
| Medium | 2 | 47 | 4 | | | 53 |
| High-Low | | 1 | 32 | 13 | 2 | 48 |
| High-Medium | | | 2 | 18 | 3 | 23 |
| High-High | | | | 3 | 25 | 28 |
| Total | 59 | 49 | 39 | 34 | 30 | 211 |

| Error Rates | Count | Percent |
| Correct Prediction | 179 | 84.83% |
| Inaccurate by 1 | 29 | 13.74% |
| Inaccurate by 2 | 3 | 1.42% |
| Inaccurate by 3 | 0 | 0.00% |
| Inaccurate by 4 | 0 | 0.00% |
| Total | 211 | 100.00% |

Figure 2. Example contingency table (rows are the estimated target; columns are the actual target).

The LOGISTIC procedure can provide an output data set containing the predicted customer risk along with the predicted probabilities for each level of risk. This is useful because firms generally want a score associated with the risk level assigned to each customer. A firm can use the sum of the weighted probabilities to calculate the score.

In the case of five risk levels (1-5), the raw score falls between 1 and 5. However, you can apply a scale by multiplying the weighted probability sum by some factor. The example in Figure 3 assigns Customer X a risk level of 5 (high-high) because that level has the highest calculated probability. To calculate the weighted probability, multiply each risk level’s weight by its predicted probability and add the results together. Since there are five risk groups, multiplying the sum of the weighted probabilities by 20 produces scores ranging from 20 to 100.

Customer X
| | Low | Medium | High-Low | High-Medium | High-High |
| Probability | 0.000000000 | 0.000000038 | 0.000004454 | 0.000943457 | 0.999052050 |
| Weight | 1 | 2 | 3 | 4 | 5 |
| Weight * Probability | 0.000000000 | 0.000000077 | 0.000013361 | 0.003773830 | 4.995260252 |

| Weighted Probability (sum) | 4.99904752 |
| Scale to 100 Points (x 20) | 99.98095039 |

Figure 3. Example of scoring.
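The Figure 3 score can be computed directly from the PREDPROBS=INDIVIDUAL output created earlier. A minimal sketch, assuming the target risk is coded 1 through 5 so that PROC LOGISTIC names the individual predicted-probability variables IP_1 through IP_5:

data ScoredCustomers;
   set yourBuildResults;   /* created by OUTPUT ... PREDPROBS=INDIVIDUAL */
   /* weight each risk level by its predicted probability, sum, */
   /* and scale to a 100-point score                            */
   score = 20 * (1*IP_1 + 2*IP_2 + 3*IP_3 + 4*IP_4 + 5*IP_5);
run;

For Customer X above, this yields 20 x 4.99904752 = 99.98.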


Deploy the Model

Deploying the model logic into production can take several forms, including a SAS-based web service, batch SAS processing, various queuing servers, or recoding the model logic into the language that the operational production system expects. For this paper, we assume the model is deployed as a SAS batch process, with the firm delivering input data that matches the data delivered for the modeling process.
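A minimal sketch of that batch step, reusing the model stored earlier with OUTMODEL=; the production data set names are hypothetical:

/* periodic batch job: score refreshed customer data with the stored model */
proc logistic inmodel=yourModel;
   score data=ProductionCustomers out=ProductionRiskScores;
run;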

Validate the Model on an Ongoing Basis

Once the customer risk rating model is established and operational, your firm must develop a plan for ongoing model validation as described in the Supervisory Guidance on Model Risk Management. Deliverables include validation of the target variable used to train the model, validation of model performance, and a model validation report.

Your firm may want to assess the relationship between the target variable (customer risk rating) and the resulting number of scenario alerts generated, case referrals, or SARs filed on the customer. This allows your firm to document that customers with a customer risk rating of 5 (“high-high”) are more likely to be involved in suspicious activity than customers with a rating of 4, 3, 2 or 1. If the target variable lacks the desired accuracy, your firm can update the target variable settings and retrain the model on the revised data set, allowing the model to more accurately identify customers who meet the firm’s definition of “risky.”

Regulators expect that these models, like all analytic models, are validated on an ongoing basis as described in the Supervisory Guidance on Model Risk Management. Validation involves tasks such as periodically assessing the model’s performance, reviewing that appropriate model controls are in place, determining that the model includes the right covariates, and adjusting the coefficients as needed. Validation may also include testing new variables that were not previously available or that previously had sparse data.

Each year, your firm should also generate a model validation report that documents all model validation tests performed and their results. The report should contain:

• A description of the model, including parameters, input variables and strengths and weaknesses.

• Validation of all model components, including input data, assumptions, processing and reports.

• Evaluation of the model’s ongoing conceptual soundness, including relevant developmental evidence.

• Evidence of ongoing monitoring, including process verification and benchmarking.

• Outcomes analysis, including back-testing.

Contact Information

Your comments and questions are valued and encouraged. Please contact the authors at:

Edwin Rivera, SAS, [email protected]

Jim West, SAS, [email protected]

Carl Suplee, SAS, [email protected]

Jason Grasso, SAS, [email protected]

To contact your local SAS office, please visit: sas.com/offices

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2015, SAS Institute Inc. All rights reserved. 107824_S139946.0715