Post on 31-Dec-2015
+Adaptive Fraud Detection
by Tom Fawcett and Foster Provost
Presented by: David Sander
+Outline
Problem Description
  • Cellular cloning fraud problem
  • Why it is important
  • Current strategies
Construction of Fraud Detector
  • Framework
  • Rule learning, Monitor construction, Evidence combination
Experiments and Evaluation
  • Data used in this study
  • Data preprocessing
  • Comparative results
Conclusion
Exam Questions
+The Problem
How to detect suspicious changes in user behavior to identify and prevent cellular fraud
  • Non-legitimate users, aka bandits, gain illicit access to a legitimate user’s, or victim’s, account
Solution useful in other contexts
  • Identifying and preventing credit card fraud, toll fraud, and computer intrusion
+Cellular Fraud - Cloning
Cloning Fraud
  • A kind of superimposition fraud (parasite)
  • Fraudulent usage is superimposed upon (added to) the legitimate usage of an account
  • Causes inconvenience to customers and great expense to cellular service providers
+Cellular communications and Cloning Fraud
Mobile Identification Number (MIN) and Electronic Serial Number (ESN)
  • Identify a specific account
  • Periodically transmitted unencrypted whenever phone is on
Bandits use MIN and ESN to fake a customer’s account
  • Bandit can make virtually unlimited, untraceable calls at someone else’s expense
+Interest in reducing Cloning Fraud
Fraud is detrimental in several ways:
  • Fraudulent usage congests cell sites
  • Fraud incurs land-line usage charges
  • Crediting process is costly to carrier and inconvenient to the customer
+Strategies for dealing with cloning fraud
Pre-call Methods
  • Identify and block fraudulent calls as they are made
  • Validate the phone or its user when a call is placed
Post-call Methods
  • Identify fraud that has already occurred on an account so that further fraudulent usage can be blocked
  • Periodically analyze call data on each account to determine whether fraud has occurred
+Pre-call Methods
Personal Identification Number (PIN)
  • PIN cracking is possible with more sophisticated equipment
RF Fingerprinting
  • Method of identifying phones by their unique transmission characteristics
Authentication
  • Reliable and secure private key encryption method
  • Requires special hardware capability
  • An estimated 30 million non-authenticatable phones are in use in the US alone (in 1997)
+Post-call Methods
Collision Detection
  • Analyze call data for temporally overlapping calls
Velocity Checking
  • Analyze the locations and times of consecutive calls
User Profiling
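The two checks above can be sketched in a few lines. This is only an illustration, not the carriers' actual implementation: the `(start, duration)` call layout, the pre-computed `miles_apart` distance, and the `max_mph` limit are all assumptions introduced here.

```python
from datetime import datetime, timedelta


def find_collisions(calls):
    """Return pairs of calls on one account whose time intervals overlap.

    Each call is a (start, duration) tuple; this layout is hypothetical.
    """
    ordered = sorted(calls, key=lambda c: c[0])
    collisions = []
    for i, (start_a, dur_a) in enumerate(ordered):
        end_a = start_a + dur_a
        for start_b, dur_b in ordered[i + 1:]:
            if start_b >= end_a:
                break  # every later call starts later still, so no overlap
            collisions.append(((start_a, dur_a), (start_b, dur_b)))
    return collisions


def is_velocity_violation(end_a, start_b, miles_apart, max_mph=600.0):
    """Flag two consecutive calls that imply implausibly fast travel.

    max_mph is an assumed limit chosen for illustration only.
    """
    hours = (start_b - end_a).total_seconds() / 3600.0
    if hours <= 0:
        return miles_apart > 0  # no time elapsed but the location changed
    return miles_apart / hours > max_mph
```

Both checks need no profile of the customer, which is why they miss bandits whose calls never overlap with, or occur far from, the legitimate user's calls.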
+Another Post-call Method (Main focus of this paper)
User Profiling
  • Analyze calling behavior to detect usage anomalies suggestive of fraud
  • Works well with low-usage customers
  • Good complement to collision and velocity checking because it covers cases the others might miss
+Sample Frauded Account
Date     Time      Day  Duration    Origin         Destination    Fraud
1/01/95  10:05:01  Mon  13 minutes  Brooklyn, NY   Stamford, CT
1/05/95  14:53:27  Fri  5 minutes   Brooklyn, NY   Greenwich, CT
1/08/95  09:42:01  Mon  3 minutes   Bronx, NY      Manhattan, NY
1/08/95  15:01:24  Mon  9 minutes   Brooklyn, NY   Brooklyn, NY
1/09/95  15:06:09  Tue  5 minutes   Manhattan, NY  Stamford, CT
1/09/95  16:28:50  Tue  53 seconds  Brooklyn, NY   Brooklyn, NY
1/10/95  01:45:36  Wed  35 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  01:46:29  Wed  34 seconds  Boston, MA     Yonkers, NY    Bandit
1/10/95  01:50:54  Wed  39 seconds  Boston, MA     Chelsea, MA    Bandit
1/10/95  11:23:28  Wed  24 seconds  Brooklyn, NY   Congers, NY
1/11/95  22:00:28  Thu  37 seconds  Boston, MA     Boston, MA     Bandit
1/11/95  22:04:01  Thu  37 seconds  Boston, MA     Boston, MA     Bandit
+The Need to be Adaptive
Patterns of fraud are dynamic – bandits constantly change their strategies in response to new detection techniques
Levels of fraud can change dramatically from month-to-month
Cost of missing fraud or dealing with false alarms changes with inter-carrier contracts
+Automatic Construction of Profiling Fraud Detectors
+One Approach
Build a fraud detection system by classifying calls as being fraudulent or legitimate
However, there are two problems that make simple classification techniques infeasible.
+Problems with simple classification
Context
  • A call that would be unusual for one customer may be typical for another customer
Granularity (over fitting?)
  • At the level of the individual call, the variation in calling behavior is large, even for a particular user
+In Summary: Learning The Problem
1) Which phone call features are important?
2) How should profiles be created?
3) When should alarms be raised?
+DC-1 Fraud Detection Stages
Stage 1: Rule Learning
Stage 2: Profile Monitoring
Stage 3: Combining Evidence
+Rule Learning – the 1st stage
Rule Generation
  • Rules are generated locally based on differences between fraudulent and normal behavior for each account
Rule Selection
  • Then they are combined in a rule selection step
+Rule Generation
DC-1 uses the RL program to generate rules with certainty factors above a user-defined threshold
For each account, RL generates a “local” set of rules describing the fraud on that account.
Example:
  (Time-of-Day = Night) AND (Location = Bronx) ==> FRAUD
  Certainty Factor = 0.89
+Rule Selection
Rule generation step typically yields tens of thousands of rules
If a rule is found in (or covers) many accounts then it is probably worth using
Selection algorithm identifies a small set of general rules that cover the accounts
Resulting set of rules is used to construct specific monitors
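The slides do not spell out the selection algorithm, so the following is a sketch under a stated assumption: a greedy cover that repeatedly takes the rule explaining the most still-uncovered accounts. The function name, the `rule_accounts` mapping, and the `min_accounts` cutoff are hypothetical.

```python
def select_rules(rule_accounts, min_accounts=2):
    """Greedily pick a small set of rules that covers the accounts.

    rule_accounts maps a rule name to the set of account ids in which that
    rule was generated. The greedy strategy is an illustrative stand-in,
    not necessarily the algorithm DC-1 uses.
    """
    # Keep only rules that appear in enough accounts to be "general".
    candidates = {
        rule: set(accounts)
        for rule, accounts in rule_accounts.items()
        if len(accounts) >= min_accounts
    }
    uncovered = set().union(*candidates.values())
    selected = []
    while uncovered:
        # Take the rule covering the most uncovered accounts; ties are
        # broken by rule name so the result is deterministic.
        best = max(candidates, key=lambda r: (len(candidates[r] & uncovered), r))
        selected.append(best)
        uncovered -= candidates.pop(best)
    return selected
```

For example, with rules found in accounts `{1, 2, 3}`, `{3, 4}`, `{1, 2}` and `{5}`, the sketch keeps the first two: the third adds no new account and the fourth covers only one account.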
+Profiling Monitors – the 2nd stage
Monitors have 2 distinct steps
Profiling step:
  • Monitor is applied to an account’s normal usage to measure the account’s normal activity
  • Statistics are saved with the account.
Use step:
  • A monitor processes a single account-day
  • References the normalcy measure from profiling
  • Generates a numeric value describing how abnormal the current account-day is
+Most Common Monitor Templates
Threshold
Standard Deviation
+Example for Standard Deviation
Rule
  (TIME OF DAY = NIGHT) AND (LOCATION = BRONX) ==> FRAUD
Profiling Step
  • The subscriber called from the Bronx an average of 5 minutes per night with a standard deviation of 2 minutes. At the end of the Profiling step, the monitor would store the values (5,2) with that account.
Use step
  • If the monitor processed a day containing 3 minutes of airtime from the Bronx at night, the monitor would emit a zero; if the monitor saw 15 minutes, it would emit (15 - 5)/2 = 5. This value denotes that the account is five standard deviations above its average (profiled) usage level.
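A minimal sketch of the standard deviation template, reproducing the numbers in the example above. The class and method names are hypothetical, and two details are assumptions: the population standard deviation (`statistics.pstdev`) is used, and a profile with zero deviation emits zero.

```python
from statistics import mean, pstdev


class StdDevMonitor:
    """Standard-deviation monitor for one rule on one account."""

    def __init__(self):
        self.avg = None
        self.std = None

    def profile(self, daily_minutes):
        """Profiling step: store (mean, std) of the account's normal usage."""
        self.avg = mean(daily_minutes)
        self.std = pstdev(daily_minutes)  # assumption: population std dev

    def use(self, minutes_today):
        """Use step: standard deviations above the profiled mean, else 0."""
        if self.std == 0 or minutes_today <= self.avg:
            return 0.0
        return (minutes_today - self.avg) / self.std
```

Profiling on nightly Bronx usage of 3, 7, 3 and 7 minutes stores (5, 2); the use step then emits 0 for a 3-minute day and (15 - 5)/2 = 5 for a 15-minute day, matching the slide.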
+Combining Evidence from the Monitors – the 3rd stage
Weights the monitor outputs and learns a threshold on the sum to produce high confidence alarms
DC-1 uses Linear Threshold Unit (LTU)
  • Simple and fast
  • Enables good first-order judgment
A feature selection process is used to
  • Choose a small set of useful monitors in the final detector
  • Some rules don’t perform well when used in monitors, some overlap
  • Forward selection process chooses set of useful monitors
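The decision made by a linear threshold unit can be sketched as a weighted sum compared against a threshold. The weights and threshold below are made-up values for illustration; in DC-1 they are learned from training account-days, and that learning step is not shown here.

```python
def ltu_alarm(monitor_outputs, weights, threshold):
    """Return True (raise an alarm) if the weighted evidence reaches threshold.

    monitor_outputs holds one numeric value per selected monitor for a
    single account-day; weights has one learned weight per monitor.
    """
    if len(monitor_outputs) != len(weights):
        raise ValueError("need exactly one weight per monitor output")
    total = sum(w * x for w, x in zip(weights, monitor_outputs))
    return total >= threshold


# Hypothetical weights for three monitors and a hypothetical threshold.
weights = [0.5, 0.25, 1.0]
print(ltu_alarm([5.0, 0.0, 1.0], weights, threshold=3.0))  # weighted sum 3.5
print(ltu_alarm([0.0, 0.0, 1.0], weights, threshold=3.0))  # weighted sum 1.0
```

Because the unit is a single weighted sum, dropping a monitor during feature selection simply removes one term.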
+Final Output of DC-1
Detector that profiles each user’s behavior based on several indicators
Raises an alarm when there is sufficient evidence of fraudulent activity
+Data Information
Four months of phone call records from the New York City area
Each call is described by 31 original attributes
Some derived attributes are added
  • Time-Of-Day (MORNING, AFTERNOON, TWILIGHT, EVENING, NIGHT)
  • To-Payphone
Calls labeled as fraudulent using block crediting
+Data Cleaning
Eliminated calls that were credited outside of the range of fraudulent call times
Days with 1-4 minutes of fraudulent usage were discarded.
  • May have been credited for other reasons, such as wrong number
Call times were normalized to Greenwich Mean Time for chronological sorting
+Data Description
After monitor creation, data is separated into “Account Days”
Selected for profiling, training and testing:
  • 3600 accounts that have at least 30 fraud-free days of usage before any fraudulent usage
  • Initial 30 days of each account were used for profiling
  • Remaining days were used to generate 96,000 account-days
  • Distinct training and testing accounts: 10,000 account-days for training; 5000 for testing
  • 20% fraud days and 80% non-fraud days
+Output of DC-1 components
Rule learning: 3630 rules
  • Each covering at least two accounts
Rule selection: 99 rules
2 monitor templates yielding 198 monitors
Final feature selection: 11 monitors
+The Importance Of Error Cost
Classification accuracy is not sufficient to evaluate performance
The costs of misclassification should be factored in
Estimated Error Costs:
  • False positive (false alarm): $5
  • False negative (letting a fraudulent account-day go undetected): $0.40 per minute of fraudulent air-time
Factoring in error costs requires second training pass by LTU (Linear Threshold Unit)
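The cost model above can be written as a short function. The two dollar figures come from the slide; the `(alarmed, fraud_minutes)` record layout and the function name are assumptions made for this sketch.

```python
FALSE_ALARM_COST = 5.00              # dollars per false alarm (from the slide)
MISSED_FRAUD_COST_PER_MINUTE = 0.40  # dollars per undetected fraud minute


def detector_cost(account_days):
    """Total error cost of a detector over a set of account-days.

    Each account-day is a hypothetical (alarmed, fraud_minutes) pair:
    alarmed is the detector's decision, fraud_minutes the true fraudulent
    air-time on that day (0 for a legitimate day).
    """
    cost = 0.0
    for alarmed, fraud_minutes in account_days:
        if alarmed and fraud_minutes == 0:
            cost += FALSE_ALARM_COST  # false positive
        elif not alarmed and fraud_minutes > 0:
            cost += MISSED_FRAUD_COST_PER_MINUTE * fraud_minutes  # false negative
    return cost
```

One false alarm plus one missed 30-minute fraud day costs $5 + $12 = $17, while correct decisions cost nothing. Two detectors with equal accuracy can therefore differ widely in cost, depending on which errors they make.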
+Alternative Detection Methods
Collisions + Velocities
  • Errors almost entirely due to false negatives
High Usage – detect sudden large jump in account usage
Best Individual DC-1 Monitor
  • (Time-of-day = Evening) ==> Fraud
SOTA - State Of The Art
  • Incorporates 13 hand-crafted profiling methods
  • Best detectors identified in a previous study
+DC-1 Vs. Alternatives
Detector                 Accuracy (%)  Cost ($)        Accuracy at Cost
Alarm on all             20            20000           20
Alarm on none            80            18111 +/- 961   80
Collisions + Velocities  82 +/- 0.3    17578 +/- 749   82 +/- 0.4
High Usage               88 +/- 0.7    6938 +/- 470    85 +/- 1.7
Best DC-1 monitor        89 +/- 0.5    7940 +/- 313    85 +/- 0.8
State of the art (SOTA)  90 +/- 0.4    6557 +/- 541    88 +/- 0.9
DC-1 detector            92 +/- 0.5    5403 +/- 507    91 +/- 0.8
SOTA plus DC-1           92 +/- 0.4    5078 +/- 319    91 +/- 0.8
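The "Alarm on all" row can be checked by hand from figures given elsewhere in the slides: 5000 test account-days, 80% of them non-fraud, and $5 per false alarm. The short calculation below is only a consistency check, under the assumption that the table was computed on that test set.

```python
TEST_ACCOUNT_DAYS = 5000   # test set size stated in the slides
NON_FRAUD_PERCENT = 80     # share of non-fraud account-days
FALSE_ALARM_COST = 5       # dollars per false alarm

# Alarming on every day flags every legitimate day as a false alarm
# and misses no fraud, so false alarms are the only cost.
non_fraud_days = TEST_ACCOUNT_DAYS * NON_FRAUD_PERCENT // 100
alarm_on_all_cost = non_fraud_days * FALSE_ALARM_COST
alarm_on_all_accuracy = 100 * (TEST_ACCOUNT_DAYS - non_fraud_days) / TEST_ACCOUNT_DAYS

print(alarm_on_all_cost, alarm_on_all_accuracy)
```

The result, $20,000 at 20% accuracy, agrees with the first row of the table.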
+Shifting Fraud Distributions
Fraud detection system should adapt to shifting fraud distributions
To illustrate the above point:
  • One non-adaptive DC-1 detector trained on a fixed distribution (80% non-fraud) and tested against range of 75-99% non-fraud
  • Another DC-1 was allowed to adapt (re-train its LTU threshold) for each fraud distribution
  • Second detector was more cost effective than the first
[Figure: Effects of Changing Fraud Distribution. x-axis: Percentage of non-fraud (75 to 100); y-axis: Cost (0 to 1.4); series: Adaptive, 80/20]
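One way to picture the adaptive detector is as re-choosing the alarm threshold that minimizes cost on data drawn from the current fraud distribution. The exhaustive search below is an assumed stand-in for DC-1's LTU re-training, and the `(score, fraud_minutes)` layout is hypothetical.

```python
def best_threshold(scored_days, false_alarm_cost=5.0, miss_cost_per_minute=0.40):
    """Return the (threshold, cost) pair with the lowest total error cost.

    scored_days is a list of (score, fraud_minutes) pairs, where score is
    the detector's weighted evidence for one account-day. An alarm is
    raised when score >= threshold.
    """
    # Only observed scores (plus "never alarm") can change the decisions.
    candidates = sorted({score for score, _ in scored_days}) + [float("inf")]
    best = None
    for threshold in candidates:
        cost = 0.0
        for score, fraud_minutes in scored_days:
            alarmed = score >= threshold
            if alarmed and fraud_minutes == 0:
                cost += false_alarm_cost
            elif not alarmed and fraud_minutes > 0:
                cost += miss_cost_per_minute * fraud_minutes
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best
```

When legitimate days become more common, false alarms dominate the cost and the chosen threshold moves up; a detector fixed at the 80/20 threshold cannot make that adjustment.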
+Conclusion
DC-1 uses a rule learning program to uncover indicators of fraudulent behavior from a large database of customer transactions
Then the indicators are used to create a set of monitors, which profile legitimate customer behavior and indicate anomalies
Finally, the outputs of the monitors are used as features in a system that learns to combine evidence to generate high confidence alarms
+Conclusion
Adaptability to dynamic patterns of fraud can be achieved by generating fraud detection systems automatically from data, using data mining techniques
DC-1 can adapt to the changing conditions typical of fraud detection environments
Experiments indicate that DC-1 performs better than other methods for detecting fraud
+Question 1
• What are the two major categories of fraud detection, how do they differ, and which one does DC-1 fall under?
  • Pre-call Methods
    • Involve validating the phone or its user when a call is placed
  • Post-call Methods – DC-1 falls here
    • Analyze call data on each account to determine whether cloning fraud has occurred
+Question 2
• Why do fraud detection methods need to be adaptive?
  • Bandits change their behavior – patterns of fraud are dynamic
  • Levels of fraud vary month-to-month
  • Cost of missing fraud or handling false alarms changes with inter-carrier contracts
+Question 3
• What are the two steps of profiling monitors, and what are the two main monitor templates?
  • Profiling Step: measure an account’s normal activity and save statistics
  • Use Step: process usage for an account-day to produce a numerical output describing how abnormal activity was on that account-day
  • Threshold and Standard Deviation monitors