Credit Card Fraud Detection using Fire Fly Algorithm



DESCRIPTION

Data Mining or Knowledge Discovery is needed to make sense of data and put it to use. Knowledge Discovery in data is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data [1]. Data mining consists of more than collecting and managing data; it also includes analysis and prediction. People often make mistakes while analyzing data or, possibly, when trying to establish relationships between multiple features. This makes it difficult for them to find solutions to certain problems. Classification models predict categorical class labels, and prediction models predict continuous-valued functions. For example, we can build a classification model to categorize bank loan applications as either safe or risky, or a prediction model to predict the expenditures in dollars of potential customers on computer equipment given their income and occupation.

S. Senthil Kumar | Ms. D. Nivya, "Credit Card Fraud Detection using Fire Fly Algorithm", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1 | Issue-6, October 2017. URL: https://www.ijtsrd.com/papers/ijtsrd4672.pdf Paper URL: http://www.ijtsrd.com/other-scientific-research-area/other/4672/credit-card-fraud-detection-using-fire-fly-algorithm/s-senthil-kumar

TRANSCRIPT

International Journal of Trend in Scientific Research and Development (IJTSRD)

International Open Access Journal

ISSN No: 2456 - 6470 | www.ijtsrd.com | Volume - 1 | Issue - 6 | Sep - Oct 2017 | Page: 728

Credit Card Fraud Detection using Fire Fly Algorithm

S. Senthil Kumar, Assistant Professor, Dr. SNS Rajalakshmi College of Arts And Science (Autonomous), Coimbatore, Tamil Nadu, India

Ms. D. Nivya, PG Student, Dr. SNS Rajalakshmi College of Arts And Science (Autonomous), Coimbatore, Tamil Nadu, India

ABSTRACT

Data Mining or Knowledge Discovery is needed to make sense of data and put it to use. Knowledge Discovery in data is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data [1]. Data mining consists of more than collecting and managing data; it also includes analysis and prediction. People often make mistakes while analyzing data or, possibly, when trying to establish relationships between multiple features. This makes it difficult for them to find solutions to certain problems. Classification models predict categorical class labels, and prediction models predict continuous-valued functions. For example, we can build a classification model to categorize bank loan applications as either safe or risky, or a prediction model to predict the expenditures in dollars of potential customers on computer equipment given their income and occupation.

Keywords: Data Mining, Credit card fraud, Classification

INTRODUCTION

The feature selection process is embedded into a classification algorithm, in order to make the feature selection process sensitive to the classification algorithm. This approach recognizes the fact that different algorithms may work better with different features. The strategy is to combine the classifiers generated by an induction algorithm. The simplest combiner determines the output solely from the outputs of the individual inducers.

WORKING OF CLASSIFICATION

With the help of the bank loan application that we have discussed above, let us understand the working of classification. The Data Classification process includes two steps,

Building the Classifier or Model

Using Classifier for Classification

Building the Classifier or Model

This step is the learning step or the learning phase. In this step the classification algorithms build the classifier. The classifier is built from the training set made up of database tuples and their associated class labels. Each tuple that constitutes the training set is assumed to belong to a predefined category or class. These tuples can also be referred to as samples, objects or data points.


Fig: (a) Classifier Builder

Using Classifier for Classification

The classifier is used for classification. Here the test data is used to estimate the accuracy of the classification rules. The classification rules can be applied to new data tuples if the accuracy is considered acceptable.
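The two steps, building the classifier and then using it, can be sketched in a few lines of Python. This is a minimal illustration and not the paper's implementation: the toy loan tuples, the one-rule classifier returned by `build_classifier`, and the 0.75 acceptance threshold are all assumptions made for the example.

```python
# Toy loan-application tuples: (income_in_thousands, class_label).
# All data below is made up for illustration.
TRAINING_SET = [(20, "risky"), (25, "risky"), (30, "risky"),
                (60, "safe"), (70, "safe"), (80, "safe")]
TEST_SET = [(22, "risky"), (28, "risky"), (65, "safe"), (90, "safe")]


def build_classifier(training_set):
    """Step 1 (learning): derive a one-rule classifier from labelled tuples.

    The rule is an income threshold halfway between the two class means.
    """
    safe = [x for x, label in training_set if label == "safe"]
    risky = [x for x, label in training_set if label == "risky"]
    threshold = (sum(safe) / len(safe) + sum(risky) / len(risky)) / 2

    def classify(income):
        return "safe" if income >= threshold else "risky"

    return classify


def estimate_accuracy(classify, test_set):
    """Step 2 (classification): fraction of test tuples labelled correctly."""
    correct = sum(1 for x, label in test_set if classify(x) == label)
    return correct / len(test_set)


classifier = build_classifier(TRAINING_SET)
accuracy = estimate_accuracy(classifier, TEST_SET)

# Apply the rule to a new tuple only if the accuracy is considered acceptable.
ACCEPTABLE_ACCURACY = 0.75  # assumed threshold, not taken from the paper
new_label = classifier(45) if accuracy >= ACCEPTABLE_ACCURACY else None
```

The test tuples are kept separate from the training tuples so that the accuracy estimate reflects data the classifier has not seen.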

Fig: (b) Classifier Usage for Classification

PROBLEM DEFINITION

Studies on the use of transaction data classification in actual applications are lacking in the literature. Datasets for credit card applications are small and not widely available. The main problem is to determine whether a transaction in the dataset is fraudulent or not. The existing literature uses only a single classifier for detecting fraudulent transactions, whereas the proposed work uses three base classifiers and one meta classifier.

METHODOLOGY

The main idea of ensemble methodology is to combine a set of models, each of which solves the same original task, in order to obtain a better composite global model, with more accurate and reliable estimates or decisions than can be obtained from using a single model. The idea of building a predictive model by integrating multiple models has been under investigation for a long time. Ensemble techniques are divided into two main categories: decision optimization and coverage optimization.
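As a rough illustration of combining several models that solve the same task, the sketch below applies a simple majority vote to the outputs of three base classifiers. The vote is only one possible combiner and is not claimed to be the paper's meta classifier; the labels and the sample outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine base-classifier outputs by simple majority.

    `predictions` holds one label per base classifier. On a tie, the label
    that appears first in the list wins (Counter keeps insertion order).
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three base classifiers for two transactions.
base_outputs = [
    ["legal", "fraud", "fraud"],  # transaction 1
    ["legal", "legal", "fraud"],  # transaction 2
]
combined = [majority_vote(row) for row in base_outputs]
```

With an odd number of base classifiers and two classes, a tie cannot occur, which is one practical reason to use three of them.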

The proposed model is trained on a small number of transactions so that fraud is easier to detect, and it is then refined with corrections so that future fraud can be detected efficiently. The main aim of the proposed method is to improve classification accuracy.

CREDIT CARD DATASET CLASSIFICATION USING KNN

The transaction date is taken as a feature for classification. The general goal is to make accurate predictions about unknown data after being trained on known data. Data comes in the form of examples with the general form (w1, .., wn, v), where w1, .., wn are also known as features, inputs or dimensions and v is the output or class label. Both the wi and v can be discrete (taking on specific values, e.g. {0, 1}) or continuous (taking on a range of values, e.g. [0, 1]).

In training we are given (w1, .., wn, v) tuples. In testing (classification), we are given only (w1, .., wn) and the goal is to predict v with high accuracy.
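A minimal k-nearest-neighbour sketch over such tuples is given below. It assumes Euclidean distance, k = 3 and a made-up set of two-feature examples scaled to [0, 1]; none of these choices is stated in the paper.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Predict the label v for query = (w1, .., wn) by k-nearest neighbours.

    `training` is a list of ((w1, .., wn), v) tuples.
    """
    # Order the training tuples by Euclidean distance to the query.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [v for _, v in by_distance[:k]]
    # Return the majority label among the k nearest neighbours.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training tuples: two features in [0, 1], label 0 (legal) or 1 (fraud).
training = [
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.15, 0.25), 0),
    ((0.90, 0.80), 1), ((0.80, 0.90), 1), ((0.85, 0.95), 1),
]

label_a = knn_predict(training, (0.20, 0.20))
label_b = knn_predict(training, (0.90, 0.90))
```

Features on very different scales would dominate the distance, so in practice they are usually normalized before KNN is applied.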

FRAUD DETECTION

Pattern matching need not be exact: small variations can be accepted, but if there is a big difference in the pattern, the chances that the particular transaction is illegal are higher. The output of the neural network lies between 0 and 1. If the output is below .6 or .7, the transaction is taken to be legal; if the output is above .7, the probability of an illegal transaction is high. On some occasions legal users may make transactions that differ considerably from their usual pattern, and sometimes fraudsters make transactions that match the pattern learned by the neural network. Because of limit constraints, legal users will use the card for limited amounts, but a fraudster will try to make big purchases before the credit card holder takes action, which will be a mismatch with the pattern learned by the neural network. Business process considerations are always present in the design of neural network pattern recognition systems. History descriptors provide details of card usage and payments made. Other descriptors hold information such as the date of issue [1].
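The decision rule on the network output can be sketched as a small function. The text gives cut-offs of .6 and .7 without saying how scores between them are handled, so the "review" band below is an assumption of this sketch, as are the function and parameter names.

```python
def label_transaction(score, legal_below=0.6, fraud_above=0.7):
    """Map a network output in [0, 1] to a decision.

    Scores between the two cut-offs are returned as "review"; that middle
    band is an assumption of this sketch, not something the text specifies.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score > fraud_above:
        return "fraud"
    if score < legal_below:
        return "legal"
    return "review"


decisions = [label_transaction(s) for s in (0.2, 0.65, 0.9)]
```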

WORKING PRINCIPLE (PATTERN RECOGNITION)

Neural network based fraud detection is modelled on the working of the human brain. A neural network lets a computer learn from past experience, as a human brain does. The knowledge gained is used to solve problems and make decisions in day-to-day life. The same method is applied to credit card fraud detection. A consumer tends to follow a fixed pattern of credit card use. This pattern, taken over the past one or two years, is used to train a neural network. Other categories of information can also be stored, such as location, kind of purchase, and frequency of large purchases within a limited time. The neural network is trained on the various faces of credit card fraud along with the credit card usage pattern provided by the bank. The prediction algorithm uses the credit card usage pattern to differentiate fraudulent from non-fraudulent transactions. The pattern of a possibly unauthorized user is matched against the original card holder's pattern learned by the neural network, and if the patterns match, the transaction is judged genuine.

BASED ON FREQUENT ITEM SET MINING

K. R. Seeja and Masoumeh Zareapoor proposed a credit card fraud detection model that detects fraud from highly imbalanced and anonymous credit card transaction datasets.

Frequent item set mining is used to find the legal and illegal patterns of transactions, which handles the class imbalance problem. To decide whether an incoming transaction of a customer belongs to the legal or the illegal pattern, a matching algorithm is proposed: the pattern to which the transaction is closer is identified and the decision is made accordingly. No special attention is given to individual attributes; to cope with the anonymous nature of the transaction data, every attribute is treated equally for pattern finding. The performance of this model was evaluated on the UCSD Data Mining Contest 2009 dataset, and it was found to have a lower false alarm rate than state-of-the-art classifiers, together with a high fraud detection rate, a high balanced classification rate and a high Matthews correlation coefficient.
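The matching step can be pictured with the sketch below, which counts how many attributes of an incoming transaction agree with a customer's legal pattern and fraud pattern, treating every attribute equally. It is a simplified illustration rather than the published matching algorithm: the attribute names `f1` to `f3`, the pattern values and the tie rule are all hypothetical.

```python
def match_count(transaction, pattern):
    """Number of attributes on which the transaction agrees with the pattern."""
    return sum(1 for attr, value in pattern.items()
               if transaction.get(attr) == value)


def classify_by_pattern(transaction, legal_pattern, fraud_pattern):
    """Label a transaction by the pattern it matches on more attributes.

    Ties are labelled "fraud"; this tie rule is an assumption of the sketch.
    """
    legal_hits = match_count(transaction, legal_pattern)
    fraud_hits = match_count(transaction, fraud_pattern)
    return "legal" if legal_hits > fraud_hits else "fraud"


# Hypothetical patterns for one customer over anonymous attributes.
legal_pattern = {"f1": 3, "f2": 1, "f3": 7}
fraud_pattern = {"f1": 9, "f2": 1, "f3": 2}

first = classify_by_pattern({"f1": 3, "f2": 1, "f3": 7}, legal_pattern, fraud_pattern)
second = classify_by_pattern({"f1": 9, "f2": 0, "f3": 2}, legal_pattern, fraud_pattern)
```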

ENSEMBLE CLASSIFIER - FIREFLY

The firefly algorithm follows three rules:

Fireflies must be unisex.

A less bright firefly is attracted towards the randomly moving brighter fireflies.

The brightness of every firefly symbolizes the quality of the solutions.

The diversity in Firefly Algorithm (FA) optimization comes from the random movement component, whilst the intensification is governed by the attraction between fireflies and the strength of attractiveness. As opposed to other meta-heuristics, exploration and exploitation in FA are relatively inter-connected; this might be a significant factor in its success in solving multi-objective and multi-modal optimization problems.


First, the algorithm generates the initial population of candidate solutions for the given problem (here, the weights of the D-TREE, SVM, and KNN). After that, it calculates the light intensity of all fireflies and finds the most attractive firefly (the best candidate) within the population. Then it calculates the attractiveness and distance for each firefly in order to move all fireflies towards the attractive firefly in the search space. Finally, the attractive firefly moves randomly in the search space. This process is repeated until a termination criterion is met, i.e., the maximum number of generations is reached. Normally, the quality of a candidate solution is measured by the error rate, which can be calculated from the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts.

The procedure starts from an initial population of randomly generated individuals. The quality of each individual is calculated using Eq. (1) and the best solution among them is selected.
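Eq. (1) is not reproduced in this transcript. As a hedged stand-in, the sketch below uses the standard definition of error rate from confusion counts, (FP + FN) / (TP + TN + FP + FN), which may or may not match the paper's exact formula. The function names and the sample labels are assumptions for illustration.

```python
def confusion_counts(actual, predicted, positive="fraud"):
    """Count TP, TN, FP, FN for two equal-length label sequences."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def error_rate(tp, tn, fp, fn):
    """Fraction of misclassified transactions: (FP + FN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion counts must not all be zero")
    return (fp + fn) / total


# Made-up labels for five transactions.
actual = ["fraud", "legal", "legal", "fraud", "legal"]
predicted = ["fraud", "legal", "fraud", "legal", "legal"]

counts = confusion_counts(actual, predicted)
rate = error_rate(*counts)
```

On highly imbalanced fraud data a low error rate can hide poor fraud recall, so the counts themselves are worth inspecting alongside the single number.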

CONCLUSION

In this paper we discussed the potential usefulness of ensemble methods. Given that usefulness, it is not surprising that a vast number of methods are now available to researchers and practitioners. Several factors differentiate the various ensemble methods. For future work, we can introduce machine learning algorithms for class imbalance problems.

Machine learning can often be successfully applied to these problems, improving the efficiency of systems and the designs of machines. There are several applications for Machine Learning (ML), the most significant of which is data mining.

Begin
  Generate the initial solution randomly
  Evaluate each individual in the population f(x) based on error rate
  Find the best solution from the population
  While (stopping criterion not satisfied)
    For i = 1 to n do
      For j = 1 to n do
        If (f(xj) < f(xi))
          Calculate the attractiveness of firefly j
          Calculate the distance between fireflies i and j
          Move firefly (xi) towards the better solution (xj)
        End if
      End for j
    End for i
    Move the best solution randomly
    Find the best solution from the new population
  End while
  Return best
End of the algorithm
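The pseudocode can be turned into a compact Python sketch. It follows the commonly used firefly update, in which firefly i moves towards a brighter firefly j by an attractiveness `beta0 * exp(-gamma * r^2)` plus a small random step. The parameter defaults, the [0, 1] search box, the best-so-far bookkeeping and the toy objective `toy_error` (standing in for the ensemble error rate) are assumptions of this sketch, not values from the paper.

```python
import math
import random


def firefly_minimize(f, dim, n=15, generations=50,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimise f over [0, 1]^dim; a lower f means a brighter firefly.

    Returns the best position found and the best value after each generation.
    """
    rng = random.Random(seed)

    def clip(v):
        return min(1.0, max(0.0, v))

    # Generate the initial population randomly and evaluate it.
    pop = [[rng.random() for _ in range(dim)] for _ in range(n)]
    cost = [f(x) for x in pop]
    start = min(range(n), key=cost.__getitem__)
    best, best_cost = pop[start][:], cost[start]
    history = []

    def remember(k):
        # Keep a copy of the best solution seen so far.
        nonlocal best, best_cost
        if cost[k] < best_cost:
            best, best_cost = pop[k][:], cost[k]

    for _ in range(generations):
        for i in range(n):
            for j in range(n):
                if cost[j] < cost[i]:  # firefly j is brighter than firefly i
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    # Move i towards j, plus a small random step.
                    pop[i] = [clip(a + beta * (b - a) + alpha * (rng.random() - 0.5))
                              for a, b in zip(pop[i], pop[j])]
                    cost[i] = f(pop[i])
                    remember(i)
        # Move the current best firefly randomly.
        leader = min(range(n), key=cost.__getitem__)
        pop[leader] = [clip(a + alpha * (rng.random() - 0.5)) for a in pop[leader]]
        cost[leader] = f(pop[leader])
        remember(leader)
        history.append(best_cost)
    return best, history


def toy_error(weights):
    """Toy stand-in for the ensemble error rate (hypothetical target weights)."""
    target = (0.5, 0.3, 0.2)
    return sum((w - t) ** 2 for w, t in zip(weights, target))


weights, history = firefly_minimize(toy_error, dim=3)
```

In the setting described in the text, `f` would instead evaluate the error rate of the D-TREE, SVM and KNN ensemble under the candidate weights on a validation set.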


REFERENCES

[1] D. Opitz and J. Shavlik, "Generating accurate and diverse members of a neural network ensemble", in David S. Touretzky, Michael C. Mozer, and Michael E. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8.

[2] Gary M. Weiss, Bianca Zadrozny, Maytal Saar-Tsechansky, "Guest editorial: special issue on utility-based data mining", Data Mining and Knowledge Discovery 17:129–135, 2008.

[3] Raghavendra Patidar, Lokesh Sharma, "Credit Card Fraud Detection Using Neural Network", International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume-1, Issue-NCAI2011, June 2011.

[4] Y. Sahin and E. Duman, "Detecting Credit Card Fraud by Decision Trees and Support Vector Machines", Proceedings of the International MultiConference of Engineers and Computer Scientists 2011, Vol I, IMECS 2011, March 16-18, 2011, Hong Kong.

[5] K. R. Seeja and Masoumeh Zareapoor, "FraudMiner: A Novel Credit Card Fraud Detection Model Based on Frequent Itemset Mining", Hindawi Publishing Corporation, The Scientific World Journal, Volume 2014, Article ID 252797, 10 pages, http://dx.doi.org/10.1155/2014/252797.