ml fridays - aws

© 2020, Amazon Web Services, Inc. or its Affiliates.

Arun Kumar Lokanatha – AI/ML solution architect

Date – 09/04/2021

ML Fridays: Fraud Detection using Deep Graph Networks


Table of contents

• Fraud Types and Market Drivers

• Rule based Systems

• Fraud Detection using Machine Learning

• Supervised

• Unsupervised

• Fraud Detection using Deep Learning

• Autoencoders

• Deep Graph Networks

• Code Examples, Demo and Getting Started guides


Fraud comes in all shapes and forms

Payment fraud

• Compromised payment instruments (e.g., stolen cards)

• Intentional nonpayment (e.g., prepaid cards)

Account takeover or compromise

• User name and password

• API key

Abuse

• Free tier misuse

• Premium phone number


Market Drivers


Fraud is big business

• 120%: Account takeover losses reached $5.1 billion in 2017, down from 280% growth in 2015 (Javelin Research)

• 113%: The increase in application fraud in 2016 (Forrester)

• $130B: Expected loss by retailers to card-not-present fraud in the next 5 years (Juniper Research)

• 53%: Increase in imposter scams (FTC)

• 28%: Global new account fraud increased 28% in 2019 (Jumio)

• $5.1T: Global cost of fraud in 2019 (Crowe.com)


Payment Fraud Trends

• 2016: $22.8b

• 2018: $27.9b

• 2021: $32.9b

• 2023: $35.7b

Data from The Nilson Report, November 2019, Issue 1164 (https://nilsonreport.com/upload/content_promo/The_Nilson_Report_Issue_1164.pdf)


Fraud Prevention strategies


Fraud prevention strategy

Prevention Detection Containment Remediation


The Fraud Detection filter

All incoming requests pass through the Fraud Detection Filter, with four possible outcomes:

• Correctly APPROVED (TN)

• Correctly DECLINED (TP)

• Incorrectly APPROVED (FN): you lose money (chargebacks & fees)

• Incorrectly DECLINED (FP): you lose customers & money (increased churn, negative reviews, lost revenue)

• The filter is a trade-off between False Positives and False Negatives
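To make the trade-off concrete, here is a minimal sketch that counts the four outcomes at different decline thresholds. The scores, labels, and the function names `confusion_counts` and `sweep` are hypothetical and only serve as an illustration; they are not taken from the talk.

```python
# Hypothetical scored requests: (fraud score in [0, 1], is_fraud label).
SCORED_REQUESTS = [
    (0.05, False), (0.10, False), (0.20, False), (0.35, False),
    (0.40, True), (0.55, False), (0.60, True), (0.80, True),
    (0.90, True), (0.95, True),
]


def confusion_counts(scored, threshold):
    """Decline every request whose score is >= threshold and count outcomes."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for score, is_fraud in scored:
        declined = score >= threshold
        if declined and is_fraud:
            counts["TP"] += 1      # correctly declined
        elif declined and not is_fraud:
            counts["FP"] += 1      # incorrectly declined
        elif not declined and is_fraud:
            counts["FN"] += 1      # incorrectly approved
        else:
            counts["TN"] += 1      # correctly approved
    return counts


def sweep(scored, thresholds):
    """Evaluate the filter at several thresholds."""
    return {t: confusion_counts(scored, t) for t in thresholds}


if __name__ == "__main__":
    for threshold, counts in sweep(SCORED_REQUESTS, [0.3, 0.5, 0.7]).items():
        print(threshold, counts)
```

On this toy data, raising the threshold removes false positives but lets more fraud through, which is the trade-off the slide describes.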


Trade-off efficiency

• Fewer False Negatives: decrease fraud losses

• More False Positives: increase revenue loss, negative reviews, and customer churn


Types of Fraudulent Behaviour

• Layer 1: Endpoint authentication, e.g. stolen card or machine

• Layer 2: Anomaly within a session, e.g. transfer before balance

• Layer 3: Anomaly within an account, e.g. unusual spikes in transfer

• Layer 4: Anomaly within multiple channels of the same account, e.g. spikes

• Layer 5: Anomaly within multiple channels and multiple accounts, e.g. irregular transfer

Fraudulent transactions are anomalies

Fraud detection is, at its core, an anomaly detection problem


Outliers

Kaggle visa dataset

https://www.kaggle.com/mlg-ulb/creditcardfraud


Fraudsters continually attack using different methods


There is no silver-bullet algorithm or solution to prevent fraud


Rule Based Fraud Detection


Rule-based Fraud Detection

if IP_ADDRESS_LOCATION is 'Japan' and
   CUST_ADDRESS_COUNTRY is 'Japan' and
   CUSTOMER_PHONE_LOC is 'Spain'
then Investigate

Rules look for specific conditions or behaviors

Pros

• Straightforward to implement

• Easy to explain
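As an illustration, the rule on this slide could be written as a small Python predicate. The `Request` class, the `RULES` list, and the `"Approve"` fallback are assumptions made for this sketch; the slide itself only names the three fields and the `Investigate` outcome.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Field names mirror the slide; the class itself is hypothetical.
    ip_address_location: str
    cust_address_country: str
    customer_phone_loc: str


def phone_location_mismatch(req):
    """Rule from the slide: IP and address say Japan, phone says Spain."""
    return (req.ip_address_location == "Japan"
            and req.cust_address_country == "Japan"
            and req.customer_phone_loc == "Spain")


# Each rule is a (name, predicate) pair so a decision is easy to explain.
RULES = [("phone_location_mismatch", phone_location_mismatch)]


def evaluate(req):
    """Return the decision and the names of the rules that fired."""
    fired = [name for name, rule in RULES if rule(req)]
    return ("Investigate", fired) if fired else ("Approve", [])


if __name__ == "__main__":
    print(evaluate(Request("Japan", "Japan", "Spain")))
```

Keeping the rule name next to the decision reflects the "easy to explain" advantage: a reviewer can see exactly which condition triggered.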


Difficulties with rule-based Fraud Detection

• $$$ billions lost to fraud each year

• Difficult to adapt to new fraud patterns

• Rules = more human reviews

• Dependent on experts to update detection logic

• Lower FP vs FN trade-off efficiency


Machine Learning for Fraud Detection


Using Machine Learning Algorithms for Fraud Detection

• Problem: What to do when we don't have annotated data, but want to identify potentially fraudulent transactions?

• Two "flavors" of machine learning:

• Supervised: access to labeled data

• Unsupervised: access to features alone


Using Machine Learning for Supervised Learning

1. Feed labeled data to algorithm

2. Discover relationships between input and output

3. Apply solution to unseen data

4. Make predictions
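The four steps can be sketched with the simplest possible supervised model, a single decision stump. This is not XGBoost; it is only the kind of one-split weak learner that boosted trees combine, and the data and function names below are hypothetical.

```python
# Hypothetical labeled data: (transaction amount, is_fraud).
LABELED = [(12.0, 0), (25.0, 0), (40.0, 0), (55.0, 0),
           (900.0, 1), (1200.0, 1), (60.0, 0), (1500.0, 1)]


def fit_stump(samples):
    """Pick the amount threshold that misclassifies the fewest samples.

    The stump predicts fraud when amount > threshold.
    """
    values = sorted({x for x, _ in samples})
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(samples) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in samples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, amount):
    """Apply the learned split to an unseen transaction."""
    return int(amount > threshold)


if __name__ == "__main__":
    threshold = fit_stump(LABELED)          # steps 1 and 2
    print(threshold, predict(threshold, 1000.0), predict(threshold, 30.0))  # steps 3 and 4
```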


Using Machine Learning for Unsupervised Learning

1. Feed raw data to algorithm

2. Uncover hidden patterns

3. Automatically flag anomalies

4. Investigate potential fraud
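A minimal sketch of the unsupervised flow, using a robust z-score (distance from the median in units of the median absolute deviation) as a simple stand-in for a real anomaly detector. The amounts, the cutoff of 5.0, and the function names are assumptions chosen for illustration.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in MAD units."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values (or most of them) are identical: nothing stands out.
        return [0.0 for _ in values]
    return [abs(v - med) / mad for v in values]


def flag_anomalies(values, cutoff=5.0):
    """Return the values whose robust score exceeds the cutoff."""
    return [v for v, s in zip(values, robust_scores(values)) if s > cutoff]


# Hypothetical unlabeled transaction amounts.
AMOUNTS = [20.0, 22.0, 19.0, 21.0, 23.0, 18.0, 20.0, 500.0]

if __name__ == "__main__":
    print(flag_anomalies(AMOUNTS))
```

No labels are used at any point; the flagged values are candidates for investigation rather than confirmed fraud.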


Solution

• Labeled Data -> Train XGBoost Model using SageMaker* -> Deploy XGBoost Model

• Unlabeled Data -> Train Random Cut Forest Model using SageMaker* -> Deploy Random Cut Forest Model

• Live Data (e.g. incoming, real-time transactions) -> Deployed models -> Predictions (e.g. anomalous transactions, fraud)

* SageMaker built-in algorithm


Solution Architecture

Components shown in the architecture diagram:

• Transactions (input)

• Amazon API Gateway

• AWS Lambda

• Amazon SageMaker (XGBoost): Fraud Detection

• Amazon SageMaker (Random Cut Forest): Anomaly Detection

• Amazon S3 bucket (Model and Data)

• Amazon Kinesis Data Firehose

• Amazon S3 bucket (Results)

• Amazon QuickSight (Optional)


The challenge

• Coming up with features is difficult

• Time consuming

• Requires domain knowledge

• Takes time to tune the features

• Labeled datasets are often not available, especially for fraud detection scenarios

• Imbalanced datasets often need additional techniques to mitigate the imbalance
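One common mitigation for class imbalance is oversampling the minority (fraud) class. The sketch below is a simplified SMOTE-style interpolation written from scratch, not the imbalanced-learn implementation; the sample points, the default `k=2`, and the function name `smote_like` are assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical fraud samples: (amount, number of attempts).
FRAUD = [(900.0, 1.0), (1200.0, 2.0), (1500.0, 1.0)]

if __name__ == "__main__":
    for point in smote_like(FRAUD, 5):
        print(point)
```

Each synthetic point lies on a line segment between two real fraud samples, so the new data stays inside the region the minority class already occupies.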


Deep Learning for Fraud Detection


Autoencoders


Learning features

• Anomaly != (Normal)

• Problem to solve: "Learn Normal"

• Learning normal is an easier problem to solve, since data for (Normal) is abundantly available
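The "learn normal" idea can be sketched with the smallest possible autoencoder: a linear model with a one-number bottleneck and tied weights. For such a model the best weights are the first principal direction of the data, so this sketch finds them by power iteration rather than by gradient descent. The data, function names, and iteration count are assumptions, and a real autoencoder would be a non-linear neural network.

```python
import math


def fit_normal(points, iterations=50):
    """Fit a one-unit linear autoencoder (tied weights) on 2-D normal data."""
    n = len(points)
    mean = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    centred = [(p[0] - mean[0], p[1] - mean[1]) for p in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    u = (1.0, 1.0)
    for _ in range(iterations):
        # Power iteration: repeatedly multiply by the covariance and normalise.
        v = (cxx * u[0] + cxy * u[1], cxy * u[0] + cyy * u[1])
        norm = math.hypot(v[0], v[1])
        u = (v[0] / norm, v[1] / norm)
    return mean, u


def reconstruction_error(model, point):
    """Encode to one number, decode back, and return the squared error."""
    mean, u = model
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    code = dx * u[0] + dy * u[1]        # encoder: project onto u
    rx, ry = code * u[0], code * u[1]   # decoder: map the code back to 2-D
    return (dx - rx) ** 2 + (dy - ry) ** 2


# Hypothetical normal behaviour: the second feature is roughly twice the first.
NORMAL = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 11.9)]

if __name__ == "__main__":
    model = fit_normal(NORMAL)
    print(reconstruction_error(model, (2.5, 5.0)))   # follows the normal pattern
    print(reconstruction_error(model, (3.5, 0.0)))   # breaks the normal pattern
```

Only normal data is used for fitting; a point is flagged when its reconstruction error is large compared with the errors seen on normal data.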


Limitations and Considerations

1. A drawback of the autoencoder is that it cannot distinguish fraudulent from normal transactions that have similar reconstruction errors.

2. A typical mitigation is to build an additional model (MLP) that further classifies the abnormal detections from the autoencoder.

3. This reduces the number of labelled samples needed to build the fraud model.

4. All the models discussed above work on linear features, whereas real-world data is often connected.


Graph Neural Networks


Overview of graphs and graph neural networks


What are graphs

Abstract representation of relationships between entities.

• Building blocks: nodes and edges

• Types: homogeneous and heterogeneous


Graph learning Tasks

• Node Classification: fraud detection, targeting the right customers

• Link Prediction: recommendations, missing relations in a knowledge graph

• Graph Classification: predicting a property of a chemical compound


Graph learning and Node Embeddings

Transform nodes to a numerical representation

• Embed nodes to a low-dimension space

• Embeddings capture the essential task-specific information

• For example, node similarities in the embedding space approximate the similarities in the original graph.

[Figure: original graph (Zachary's Karate Club) and its embeddings represented in 2D]


Traditional Graph learning techniques

Generate embeddings by manual feature engineering

• Requires domain expertise, involves considerable manual fine-tuning, is time consuming, does not scale, …

Automatically generate embeddings using unsupervised dimensionality reduction approaches

• Singular value decomposition, tensor decomposition, co-factorization, DeepWalk, etc.

• Cannot effectively combine rich attributes with network structure.

• Employ mostly (multi-)linear models.

• Do not allow for end-to-end learning.


Graph Neural Networks

A family of (deep) neural networks that learn node, edge, and graph embeddings


How Graph Neural Networks work

Graph Neural Networks are based on Message Passing

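[Figure: message passing on a graph with nodes v1 to v5 and hidden states h1 to h5. AGGREGATE collects the neighbor states h2, h3, h4, h5 into a message m1; COMBINE merges m1 with h1 to update node v1.]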

Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., & Dahl, G. E. (2017, August). Neural message passing for quantum chemistry.

Xu, K., Hu, W., Leskovec, J., & Jegelka, S. (2018). How powerful are graph neural networks?



Graph Neural Network models

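Message from node w to node v at layer l:

GCN:   $M_{vw}^{(l)} = \frac{h_w^{(l-1)}}{d_v + 1}$

GAT:   $M_{vw}^{(l)} = \alpha_{vw} \, h_w^{(l-1)}$

R-GCN: $M_{vw}^{(l)} = \frac{1}{c_{v,r}} W_r^{(l)} h_w^{(l-1)}$

AGGREGATE: $m_v^{(l)} = \sum_{w \in N(v) \cup \{v\}} M_{vw}^{(l)}$

COMBINE:   $h_v^{(l)} = \phi\left(m_v^{(l)} W^{(l)}\right)$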

Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks.

Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., & Bengio, Y. (2017). Graph attention networks.

Schlichtkrull, M., Kipf, T. N., Bloem, P., Van Den Berg, R., Titov, I., & Welling, M. (2018, June). Modeling relational data with graph convolutional networks.
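The GCN variant of the AGGREGATE and COMBINE steps can be sketched in plain Python. The sketch assumes the message is the neighbor state divided by (d_v + 1), the aggregation sums over the neighbors and the node itself, and the non-linearity is a ReLU. The star-shaped toy graph, the feature values, and the identity weight matrix are hypothetical.

```python
def gcn_layer(adjacency, h, weight):
    """One GCN-style message-passing layer.

    adjacency: dict node -> list of neighbours (undirected graph)
    h:         dict node -> feature vector (list of floats) from layer l-1
    weight:    matrix W (list of rows) shared by all nodes
    """
    new_h = {}
    for v, neighbours in adjacency.items():
        scale = 1.0 / (len(neighbours) + 1)   # 1 / (d_v + 1)
        # AGGREGATE: sum the scaled messages from N(v) and from v itself.
        m = [0.0] * len(h[v])
        for w in neighbours + [v]:
            for i, value in enumerate(h[w]):
                m[i] += scale * value
        # COMBINE: multiply by W, then apply the non-linearity (ReLU here).
        out = [sum(m[i] * weight[i][j] for i in range(len(m)))
               for j in range(len(weight[0]))]
        new_h[v] = [max(0.0, x) for x in out]
    return new_h


# Toy graph: node 1 is connected to nodes 2, 3, 4 and 5.
ADJ = {1: [2, 3, 4, 5], 2: [1], 3: [1], 4: [1], 5: [1]}
H0 = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.0, 1.0], 4: [0.0, 1.0], 5: [5.0, 0.0]}
W = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    for node, state in gcn_layer(ADJ, H0, W).items():
        print(node, state)
```

Stacking several such layers lets information travel several hops, which is how a node's embedding comes to reflect its wider neighborhood.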


Deep Graph Library (DGL) for Deep Learning on Graphs

1. DGL is a toolkit for Deep Learning on graphs.

2. Supports all popular frameworks.

3. Comes with predefined graph networks for GCN, R-GCN, GAT, etc.

4. Makes it easier to build your own custom networks by providing common functions.

5. Benefits of using DGL on Amazon SageMaker:

- Efficiently train models for graphs with up to millions of nodes and billions of edges.

https://docs.aws.amazon.com/sagemaker/latest/dg/deep-graph-library.html


Fraud detection with graph neural networks


Fraud Detection with Graphs

• Common issue: fraudsters can evolve to fool rule-based methods or simple feature-based methods

• Observation: fraudsters cannot mask their behavior with respect to the full interaction graph

• Often connected objects -> guilt-by-association

• Combine weak signals from individual nodes to derive stronger ones

• Node aggregation: fraudulent or malicious users tend to connect with many other users or entities

• Activity aggregation: fraudulent or malicious users are linked to accounts that act in a coordinated fashion in short bursts of time


Fraud Detection - Formulation

Context

A user signs up and, once some usage data is collected, we predict whether the user is fraudulent or not.

User   DeviceID         IP Address     MAC Address        Label
00049  990000862471854  216.3.128.12   00:0a:95:9d:68:16  0
⋮      ⋮                ⋮              ⋮                  ⋮
06302  351756051523999  66.249.64.163  00:0a:95:9d:68:16  1
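A table like this one can be turned into a heterogeneous graph by linking each user to the identifiers it uses. The sketch below uses only the two rows visible on the slide; the dictionary keys, relation names, and function names are assumptions made for illustration.

```python
from collections import defaultdict

# The two rows visible on the slide (the elided rows are omitted).
ROWS = [
    {"user": "00049", "device_id": "990000862471854",
     "ip": "216.3.128.12", "mac": "00:0a:95:9d:68:16", "label": 0},
    {"user": "06302", "device_id": "351756051523999",
     "ip": "66.249.64.163", "mac": "00:0a:95:9d:68:16", "label": 1},
]


def build_edges(rows, identifier_columns=("device_id", "ip", "mac")):
    """Return (user, relation, identifier) edges of a heterogeneous graph."""
    return [(r["user"], col, r[col]) for r in rows for col in identifier_columns]


def users_sharing_identifiers(edges):
    """Map each identifier used by more than one user to those users."""
    users_by_identifier = defaultdict(set)
    for user, relation, identifier in edges:
        users_by_identifier[(relation, identifier)].add(user)
    return {key: sorted(users)
            for key, users in users_by_identifier.items() if len(users) > 1}


if __name__ == "__main__":
    edges = build_edges(ROWS)
    print(users_sharing_identifiers(edges))
```

In these two rows the users share a MAC address, which is exactly the kind of connection that guilt-by-association reasoning exploits.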


Fraud Detection – User Features

User   Time stamp             Activity         Success  Trans_amt  Bal Amt
00049  09/10/2007 @ 12:45am   ChangedPassword  0        N/A        N/A
⋮      ⋮                      ⋮                ⋮        ⋮          ⋮
06302  09/11/2007 @ 11:45pm   BalanceTransfer  1        3419       0

Feature Transformation

Each user's activity log becomes one feature row:

• Activity_x: counts for day 1 ⋯ day 30, each split into hour 0 ⋯ hour 23

• success

• Trans_amt: bucketed from ≤ 10^3 ⋯ > 10^6

Example row for user 00049: 18 ⋯ 18 ⋯ 0 ⋯ 0 3 -15 8 ⋯ 0
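The transformation from an activity log to per-user hourly counts can be sketched as follows. The log rows are the two visible on the slide; reading the dates as month/day/year, using `None` for N/A, and the function names are assumptions.

```python
from collections import Counter
from datetime import datetime

# The two log rows visible on the slide; None stands for N/A.
LOG = [
    ("00049", "09/10/2007 @ 12:45am", "ChangedPassword", 0, None, None),
    ("06302", "09/11/2007 @ 11:45pm", "BalanceTransfer", 1, 3419, 0),
]


def hourly_activity_counts(log):
    """Count events per (user, activity, day of month, hour of day)."""
    counts = Counter()
    for user, stamp, activity, _success, _amount, _balance in log:
        # Assumes month/day/year order and a 12-hour clock with am/pm.
        when = datetime.strptime(stamp, "%m/%d/%Y @ %I:%M%p")
        counts[(user, activity, when.day, when.hour)] += 1
    return counts


def feature_row(counts, user, activity, day):
    """24 hourly counts of one activity on one day for one user."""
    return [counts[(user, activity, day, hour)] for hour in range(24)]


if __name__ == "__main__":
    counts = hourly_activity_counts(LOG)
    print(feature_row(counts, "00049", "ChangedPassword", 10))
```

Concatenating such rows across activities and days gives one fixed-length feature vector per user, which is the form a graph neural network expects as node features.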


Using Graph Neural Networks to Detect Fraud

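User features:

$X = \begin{pmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{pmatrix}$

User features + Graph -> R-GNN Layer -> … -> R-GNN Layer -> User embeddings -> Classifier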


Fraud Detection – GNN Results


Code Walkthrough

https://github.com/awslabs/sagemaker-graph-fraud-detection


Architecture using Amazon SageMaker


Resources

• Random Cut Forest Documentation (AWS Docs): https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html

• XGBoost Documentation (AWS Docs): https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html

• Imbalanced-learn (SMOTE) Library Documentation: https://imbalanced-learn.readthedocs.io/en/stable/auto_examples/over-sampling/plot_comparison_over_sampling.html#sphx-glr-auto-examples-over-sampling-plot-comparison-over-sampling-py

• SMOTE original publication (arXiv): https://arxiv.org/abs/1106.1813

• Learning from imbalanced data, review article: https://doi.org/10.1016/j.eswa.2016.12.035

• Fraud Detection using Autoencoders (GitHub): https://github.com/aws-samples/amazon-sagemaker-fraud-detection


Getting Started

All code can be found on GitHub:

https://github.com/awslabs/sagemaker-graph-fraud-detection


Q&A

Arun Kumar Lokanatha

AI/ML Solution Architect


Thank You

Arun Kumar L
