A Seminar Report on

CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING

ALGORITHM (Random Forest)

Submitted in partial fulfillment of the requirements for the degree of

BACHELOR OF ENGINEERING
In
COMPUTER ENGINEERING

Submitted By
Atharva Nitin Gokhare
20CO028

Under the Guidance of

Mrs. Minal Swami
Department of Computer Engineering
ALL INDIA SHRI SHIVAJI MEMORIAL SOCIETY’S COLLEGE
OF ENGINEERING
PUNE-411001
Academic Year: 2022-23(Term-I)
Savitribai Phule Pune University

DEPARTMENT OF COMPUTER ENGINEERING

CERTIFICATE

This is to certify that Atharva Nitin Gokhare from Third Year

Computer Engineering has successfully completed his seminar work
titled “CREDIT CARD FRAUD DETECTION USING
MACHINE LEARNING ALGORITHM (Random Forest)” at AISSMS
College of Engineering, Pune in partial fulfillment of the bachelor's
degree in Engineering.

Mrs. Minal Swami, Seminar Guide, Computer Engineering
Dr. S. V. Athawale, HOD, Computer Engineering
Dr. D. S. Bormane, Principal, AISSMS COE, Pune

SEMINAR APPROVAL
The Seminar entitled

CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING

ALGORITHM (Random Forest Algo)

ATHARVA NITIN GOKHARE

(20CO028)

is approved for the degree of

Bachelor of Engineering

Computer Engineering

Examiner 1 Examiner 2
Name and Signature Name and Signature

Date: -

Place: -

ABSTRACT

Credit cards offer a convenient and effective way to make purchases
online. As credit card use has increased, so has the likelihood of
credit card misuse. Both cardholders and financial institutions suffer
large financial losses because of credit card fraud. The primary goal
of this study is to identify such fraud despite challenges that
include severe class imbalance, limited data availability, changes in
the nature of fraud, and high false alarm rates. Many machine
learning algorithms for credit card fraud detection are presented in
the literature; this report focuses on the Random Forest Algorithm
(RFA). However, because precision remains poor, state-of-the-art
methods are still needed.

ACKNOWLEDGEMENT

It gives me great pleasure to acknowledge the contribution of all those who have directly or
indirectly contributed to the completion of this seminar.

I express my foremost and deepest gratitude to my guide Mrs. Minal Swami for her
supervision, noble guidance and constant encouragement in carrying out the seminar.

I am deeply indebted to Dr. S V Athavale, Head, Computer Engineering Department, for
his advice and support from time to time and for providing all the necessary facilities
available in the Institute.

This acknowledgement would be incomplete without special thanks to all the
teaching and non-teaching staff of the Computer Engineering Department for rendering
support directly or indirectly.

Academic Year: 2022-23

Date-

CONTENTS
Certificate

Seminar Approval

Abstract

Acknowledgements

Contents

List of Figures

Introduction

1 SUPERVISED MACHINE LEARNING APPROACHES

2 Related work

3 Research methodology

4 Feature Selection

5 RANDOM FOREST – The machine learning Algorithm

6 How does Random Forest algorithm work?

7 Applications of Random Forest

8 Advantages and Disadvantages

9 Python Implementation of Random Forest Algorithm

10 RFA IMPLEMENTATION IN CREDIT CARD FRAUD DETECTION

11 SYSTEM ARCHITECTURE

12 Implementation Modules

13 Future Enhancements

14 Conclusion

15 References

CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING
ALGORITHM

Introduction

'Fraud' in credit card transactions is the unauthorized and unwanted use of an account by
someone other than its owner. Preventive measures can be taken to stop this abuse, and the
behavior behind such fraudulent practices can be studied to minimize it and to protect
against similar occurrences in the future. In other words, credit card fraud is a case
where a person uses someone else’s credit card for personal reasons while the owner and the
card-issuing authorities are unaware that the card is being used. Fraud detection involves
monitoring the activities of populations of users to estimate, perceive, or avoid
objectionable behavior, which includes fraud, intrusion, and defaulting. This is a very
relevant problem that demands the attention of communities such as machine learning and
data science, where the solution can be automated. The problem is particularly challenging
from a learning perspective because it is characterized by factors such as class imbalance.

Valid transactions far outnumber fraudulent ones. Transaction patterns also change their
statistical properties over time. These are not the only challenges in implementing a
real-world fraud detection system, however. In real-world systems, the massive stream of
payment requests is quickly scanned by automatic tools that determine which transactions to
authorize. Machine learning algorithms are employed to analyze all the authorized
transactions and report suspicious ones. These reports are investigated by professionals
who contact the cardholders to confirm whether the transaction was genuine or fraudulent.
The investigators provide feedback to the automated system, which is used to train and
update the algorithm and so improve fraud-detection performance over time.

ML models have been used in many studies to solve numerous challenges. Deep learning (DL)
algorithms have been applied in computer networks, intrusion detection, banking, insurance,
mobile cellular networks, health care fraud detection, medical and malware detection,
detection for video surveillance, location tracking, Android malware detection, home
automation, and heart disease prediction. In this paper we explore the practical
application of ML, particularly DL algorithms, to identifying credit card fraud (CCF) in
the banking industry. The support vector machine (SVM) is a supervised ML technique for
data classification problems. It is employed in a variety of domains, including image
recognition, credit rating, and public safety. SVM can tackle linear and nonlinear binary
classification problems: it finds a hyperplane, defined by the support vectors, that
separates the input data, and it often performs well against other classifiers. Neural
networks were among the first methods used to identify credit card fraud. As a result,
current work concentrates on DL, a branch of ML. In recent years, deep learning approaches
have received significant attention due to promising outcomes in various applications, such
as computer vision, natural language processing, and voice. However, only a few studies
have examined the application of deep neural networks to identifying CCF. Several deep
learning algorithms can be used to detect CCF. In this study, we choose the CNN model and
its layers to determine whether a transaction in the labelled dataset is fraudulent or
normal. Datasets commonly contain transactions that have been labelled fraudulent and that
show questionable transaction behavior. We therefore focus on supervised and unsupervised
learning in this research paper. Class imbalance is the problem in ML where the number of
samples in one class (positive) is far smaller than the number in another class (negative).
The classification of unbalanced datasets has been the subject of several studies, and
this extensive literature offers several answers.

Nevertheless, to the best of our knowledge, the problem of class imbalance has not yet been
solved. We propose to alter the CNN model by adding layers for feature extraction and for
the classification of credit card transactions as fraudulent or otherwise. The top
attributes of the prepared dataset are ranked using feature selection techniques. After
that, CCF is classified using several supervised machine learning and deep learning models.
The main aim of this study is to detect fraudulent credit card transactions with the help
of ML and deep learning algorithms. This study makes the following contributions:

• Feature selection algorithms are used to rank the top features of the CCF transaction
dataset, which helps in class label prediction.
• A deep learning model is proposed in which a number of additional layers are used for
feature extraction and classification on the credit card fraud detection dataset.
• Different architectures of CNN layers are applied to analyze the performance of the CNN
model.
• A comparative analysis is performed between ML and DL algorithms and between the proposed
CNN and a baseline model; the results show that the proposed approach outperforms existing
approaches.
• Performance evaluation measures (accuracy, precision, and recall) are used to assess the
classifiers.

Experiments are performed on the latest credit card dataset. The rest of the paper
is structured as follows: The second section examines the related works. It also shows the
outcomes of our tests on a real dataset, as well as the analysis.

1. SUPERVISED MACHINE LEARNING APPROACHES
ML has many branches, and each branch deals with different learning tasks and uses
different types of framework. ML approaches such as random forest (RF) provide a solution
for credit card fraud (CCF) detection. A random forest is an ensemble of decision trees,
and most researchers use the RF approach. RF can also be combined with network analysis;
this method is called APATE. Researchers can use different ML techniques, both supervised
and unsupervised. ML algorithms such as logistic regression (LR), artificial neural
networks (ANN), decision trees (DT), support vector machines (SVM), and naïve Bayes (NB)
are commonly used for CCF detection. These techniques can be combined with ensemble
techniques to construct solid detection classifiers.

An artificial neural network is a set of linked neurons, or nodes. A feed-forward
multilayer perceptron is built up of several layers: an input layer, an output layer, and
one or more hidden layers. The first layer contains the input nodes, which represent the
explanatory variables. These inputs are multiplied by specific weights and passed to each
hidden-layer node, where they are added together with a certain bias. An activation
function is then applied to this summation to create the output of each neuron, which is
transferred to the next layer. Finally, the output layer provides the algorithm’s reply.
The weights are first set randomly and then adjusted on the training set to minimise the
error, using algorithms such as backpropagation.

A Bayesian belief network (BBN) is a graphical model of the probabilistic relationships
among a set of variables. It was developed to relax the independence assumption of naïve
Bayes and to allow for dependencies among variables. Variables are represented as nodes,
while conditional dependencies between variables are shown as arcs between nodes. Each node
has an associated conditional probability table, which gives the probabilities of the
node’s variable conditional on the values of its parent nodes. The computational procedure
for a BBN is as follows. The first step is to find a structure for the network, which may
be specified by human experts or learned from the data by specific algorithms. Once the
network topology is found, fitting the network to historical data is straightforward, as
in naïve Bayes, where continuous variables are either discretised or assumed to be normally
distributed. Correspondingly, in a BBN each node is assumed to be independent of its
non-descendants given its parents in the graph. This is known as the Markov condition.

The support vector machine (SVM) is a linear model for classification and regression
problems. According to the SVM algorithm, we find the points from both classes that lie
closest to the separating line. These points are called support vectors. This report is
concerned with the integration of unsupervised techniques with supervised techniques for
CCF detection.

TABLE 1. Algorithms of machine learning and their accuracy.
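
The feed-forward computation described in this section (inputs multiplied by weights, a bias added, and an activation function applied to the sum) can be sketched in a few lines of Python. This is only a minimal illustration: the layer sizes, the weights and biases, and the choice of a sigmoid activation are assumptions made for the example and do not come from this report.

```python
import math


def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    weights[j][i] is the weight from input i to neuron j (illustrative values).
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs plus the neuron's bias.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        # The activation function turns the sum into the neuron's output.
        outputs.append(sigmoid(z))
    return outputs


def mlp_forward(inputs, layers):
    """Pass the inputs through every layer in turn (hidden layers, then output)."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
example_layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                     # output layer
]
score = mlp_forward([0.0, 0.0], example_layers)[0]
```

Training (for example with backpropagation) would adjust the values in `example_layers`; the sketch covers the forward pass only.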

2. Related work

One group of authors implemented a credit card fraud detection system
using several ML algorithms, including logistic regression (LR), decision
tree (DT), support vector machine (SVM), and random forest (RF). These
classifiers were evaluated on a credit card fraud detection dataset
generated from European cardholders in 2013. In this dataset, the ratio
between non-fraudulent and fraudulent transactions is highly skewed; it is
therefore a highly imbalanced dataset. The researchers used classification
accuracy to assess the performance of each ML approach. The experimental
outcomes showed that LR, DT, SVM, and RF obtained accuracy scores of
97.70%, 95.50%, 97.50%, and 98.60%, respectively. Although these outcomes
are good, the authors suggested that advanced pre-processing techniques
could further improve the performance of the classifiers.

Another study proposed a credit card fraud detection method using ML. The
authors used a credit card fraud dataset sourced from Kaggle. This dataset
contains transactions made within 2 days by European credit card holders.
To deal with the class imbalance present in the dataset, the researchers
applied the Synthetic Minority Oversampling Technique (SMOTE). The
following ML methods were implemented to assess the efficacy of the
proposed method: RF, NB, and multilayer perceptron (MLP). The experimental
results demonstrated that the RF algorithm performed best, with a fraud
detection accuracy of 99.96%. The NB and MLP methods obtained accuracy
scores of 99.23% and 99.93%, respectively. The authors concede that more
research should be conducted to implement a feature selection method that
could improve the accuracy of other ML methods.

Another study conducted a performance analysis of ML techniques for credit
card fraud detection. The authors considered the following ML approaches:
DT, k-Nearest Neighbor (KNN), LR, RF, and NB. To assess the performance of
each ML method, the authors used a highly imbalanced dataset generated
from European cardholders. One of the main performance metrics used in the
experiments was the precision obtained by each classifier. The experimental
outcomes showed that DT, KNN, LR, RF, and NB obtained precisions of
85.11%, 91.11%, 87.5%, 89.77%, and 6.52%, respectively.

Another study presented a comparative analysis of different ML methods on
the European cardholders' credit card fraud dataset. The authors used a
hybrid sampling technique to deal with the imbalanced nature of the
dataset. The following ML methods were considered: NB, KNN, and LR. The
experiments were carried out using a Python-based ML framework. Accuracy
was the main performance metric used to assess the effectiveness of each
ML approach. The experimental results demonstrated that NB, LR, and KNN
achieved accuracies of 97.92%, 54.86%, and 97.69%, respectively. Although
NB and KNN performed relatively well, the authors did not explore the
possibility of implementing a feature selection method.

Another group of researchers used several ML-based methods to address
credit card fraud. In this work, the researchers used the European credit
cardholder fraud dataset. To deal with the highly imbalanced nature of this
dataset, the authors employed the SMOTE sampling technique. The following
ML methods were considered: DT, LR, and Isolation Forest (IF). Accuracy was
one of the main performance metrics considered. The results showed that DT,
LR, and IF obtained accuracy scores of 97.08%, 97.18%, and 58.83%,
respectively.

Another study implemented an intelligent payment card fraud detection
system using a genetic algorithm (GA) for feature selection and
aggregation. The authors implemented several machine learning algorithms
to validate the effectiveness of their proposed method. The results
demonstrated that GA-RF obtained an accuracy of 77.95%, GA-ANN achieved an
accuracy of 81.82%, and GA-DT attained an accuracy of 81.97%.

3. Research methodology
Dataset
In this research, we use a dataset of credit card transactions made by
European cardholders over 2 days in September 2013. The dataset contains
284807 transactions in total, of which 0.172% are fraudulent. It has 30
features: V1, ..., V28, Time, and Amount. All the attributes in the
dataset are numerical. The last column represents the class (type of
transaction), where a value of 1 denotes a fraudulent transaction and a
value of 0 a genuine one. The features V1 to V28 are not named, for data
security and integrity reasons. This dataset has been used in earlier
work, and one of the key issues that we discovered is the low detection
accuracy obtained by those models because of the highly imbalanced nature
of the dataset. To address the class imbalance, we applied the Synthetic
Minority Oversampling Technique (SMOTE) in the data preprocessing phase of
the proposed framework. SMOTE works by picking minority-class samples that
are close to each other in the feature space, drawing a line between them,
and creating a new instance of the minority class at a point along that
line.
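
The interpolation idea behind SMOTE described above can be sketched as follows. This is a simplified illustration written for this report, not the reference SMOTE implementation: the function names, the neighbour count `k`, and the toy points are assumptions, and the sketch needs at least two minority samples.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority-class point by interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours to choose from (assumed value).
    """
    rng = rng or random.Random()
    # Pick a random minority sample as the starting point.
    base = rng.choice(minority)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Find the k minority samples closest to the base point (excluding itself).
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
    neighbour = rng.choice(neighbours)
    # Place the new point at a random position on the line between the two.
    t = rng.random()
    return [b + t * (n - b) for b, n in zip(base, neighbour)]


def oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (seed fixed for repeatability)."""
    rng = random.Random(seed)
    return [smote_sample(minority, k, rng) for _ in range(n_new)]


# Toy minority class with two features; the values are made up.
fraud_points = [[1.0, 1.0], [1.5, 1.2], [2.0, 1.8], [1.2, 2.0]]
synthetic = oversample(fraud_points, n_new=5)
```

Because every synthetic point lies on a segment between two existing fraud samples, the new points stay inside the region already occupied by the minority class.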

4. Feature selection

Feature selection (FS) is a crucial step when implementing machine
learning methods. This is partly because the dataset used during training
and testing may have a large feature space, which can negatively affect
the overall performance of the models. The choice of FS method depends on
the kind of problem a researcher is trying to solve. The following
paragraph provides an overview of instances where using an FS method
improved the performance of ML models.
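
As an illustration of what ranking features can look like, the sketch below scores each feature by the absolute value of its Pearson correlation with the class label. The report does not state which FS technique was used, so this filter-style ranking is an assumption chosen for simplicity, and the column values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant column carries no information.
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, labels):
    """Return feature names sorted by |correlation with the class label|."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up toy data: "V1" follows the label closely, "Amount" only loosely,
# and "Time" is unrelated to the label.
toy_columns = {
    "Time": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "V1": [0.1, 0.2, 0.1, 0.9, 1.0, 0.8],
    "Amount": [10.0, 50.0, 20.0, 40.0, 30.0, 60.0],
}
toy_labels = [0, 0, 0, 1, 1, 1]
ranking = rank_features(toy_columns, toy_labels)
```

A model could then be trained on only the first few names in `ranking`; other FS methods (wrapper or embedded approaches) would rank the features differently.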

5. RANDOM FOREST – The machine learning Algorithm

Random forests are an ensemble method used for classification. In a Random Forest, we grow
multiple trees, as opposed to the single tree of a Decision Tree model. The question arises
why multiple trees should be used when a single tree can do the same work. One of the major
problems of a Decision Tree is overfitting, which yields a poor predictive model; growing
multiple trees in the forest introduces randomness, which in turn reduces overfitting and
gives a better predictive model. To classify a new object based on its attributes, each
tree gives a classification, and we say the tree “votes” for that class. The forest chooses
the classification with the most votes (over all the trees in the forest); in the case of
regression, it takes the average of the outputs of the different trees. Random Forest, also
called Random Decision Forest and abbreviated here as RFA (Random Forest Algorithm), is
used for classification, regression, and other tasks that are performed by constructing
multiple decision trees. The algorithm is based on supervised learning, and a major
advantage is that it can be used for both classification and regression. The Random Forest
Algorithm often gives better accuracy than many existing approaches and is one of the most
commonly used algorithms. In this paper, the use of the Random Forest algorithm in credit
card fraud detection gives an accuracy of about 90 to 95%.
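
The two ways a forest combines its trees, majority voting for classification and averaging for regression, can be written down directly. The function names and the tree outputs below are hypothetical and serve only to illustrate the aggregation step.

```python
from collections import Counter


def forest_classify(tree_votes):
    """Return the class that received the most votes from the trees."""
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_outputs):
    """Return the average of the trees' numeric outputs."""
    return sum(tree_outputs) / len(tree_outputs)


# Hypothetical outputs of five trees for a single transaction (1 = fraud).
votes = [1, 0, 1, 1, 0]
predicted_class = forest_classify(votes)           # majority of the five votes
predicted_value = forest_regress([2.0, 4.0, 3.0])  # mean of three outputs
```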

6. How does Random Forest algorithm work?
Random Forest works in two phases: the first is to create the random forest by combining N
decision trees, and the second is to make predictions with each tree created in the first
phase.

The working process can be explained in the steps and diagram below:

Step-1: Select K random data points from the training set.

Step-2: Build the decision trees associated with the selected data points (subsets).

Step-3: Choose the number N of decision trees that you want to build.

Step-4: Repeat Steps 1 and 2.

Step-5: For new data points, find the predictions of each decision tree, and assign the new
data points to the category that wins the majority of votes.
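
The steps above can be sketched as a very small forest in plain Python. To keep the example short, each "tree" is a one-split decision stump on a randomly chosen feature, which is a simplification of the full decision trees a real random forest grows; the function names, the default of 25 trees, and the toy data are assumptions made for this illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-split decision tree (a "stump") on the given sample.

    A random feature is tried, as a simplified stand-in for the random
    feature subsets used by real random forests.
    """
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue
        left_class = Counter(left).most_common(1)[0][0]
        right_class = Counter(right).most_common(1)[0][0]
        # Count training points that this split labels correctly.
        correct = left.count(left_class) + right.count(right_class)
        if best is None or correct > best[0]:
            best = (correct, threshold, left_class, right_class)
    if best is None:
        # All sampled values were identical: always predict the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return lambda row: majority
    _, threshold, left_class, right_class = best
    return lambda row: left_class if row[feature] <= threshold else right_class


def train_forest(rows, labels, n_trees=25, sample_size=None, seed=0):
    """Steps 1-4: draw K random points (with replacement) and build N trees."""
    rng = random.Random(seed)
    k = sample_size or len(rows)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in range(k)]
        forest.append(train_stump([rows[i] for i in picks],
                                  [labels[i] for i in picks], rng))
    return forest


def predict(forest, row):
    """Step 5: every tree votes and the majority class wins."""
    votes = [tree(row) for tree in forest]
    return Counter(votes).most_common(1)[0][0]


# Made-up, well-separated toy data: class 1 has large values in both features.
X = [[1.0, 1.2], [0.8, 0.9], [1.1, 1.0], [0.9, 1.1],
     [5.0, 5.2], [5.1, 4.9], [4.8, 5.0], [5.2, 5.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_forest(X, y)
```

Calling `predict(forest, [0.5, 0.5])` or `predict(forest, [6.0, 6.0])` asks all the stumps to vote on a new point. Each stump sees a different random sample, so individual stumps may disagree, and the majority vote smooths those differences out.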

The working of the algorithm can be better understood by the below example:

Example: Suppose there is a dataset that contains multiple fruit images, and this dataset
is given to the Random Forest classifier. The dataset is divided into subsets, and each
subset is given to a decision tree. During the training phase, each decision tree produces
a prediction result, and when a new data point occurs, the Random Forest classifier
predicts the final decision based on the majority of results. Consider the below image:

7. Applications of Random Forest
There are mainly four sectors where Random Forest is mostly used:

1. Banking: Banking sector mostly uses this algorithm for the identification of loan
risk.
2. Medicine: With the help of this algorithm, disease trends and risks of the disease
can be identified.
3. Land Use: We can identify the areas of similar land use by this algorithm.
4. Marketing: Marketing trends can be identified using this algorithm.

8. Advantages of Random Forest

• Random Forest can perform both Classification and Regression tasks.
• It is capable of handling large datasets with high dimensionality.
• It improves the accuracy of the model and reduces the overfitting issue.

Disadvantages of Random Forest

• Although random forest can be used for both classification and regression tasks, it is
less suitable for regression tasks.

9. Python Implementation of Random Forest Algorithm

Now we will implement the Random Forest Algorithm using Python. For this, we will
use the same dataset "user_data.csv", which we have used in previous classification
models. By using the same dataset, we can compare the Random Forest classifier with
other classification models such as Decision tree Classifier, KNN, SVM, Logistic Regression,
etc.

10. RFA IMPLEMENTATION IN CREDIT CARD
FRAUD DETECTION
In credit card fraud detection, the Random Forest Algorithm gives better accuracy in
results. First, all the data is collected and analyzed. During the analysis, all duplicate
values and null values are removed from the dataset. The dataset is then preprocessed based
on the amount and transaction time in order to find the accuracy of the resultant dataset.
After preprocessing, the dataset is divided into two parts: training data and test data.
For this split we use ‘Scikit-learn’, a free machine learning library for Python that
provides classification, regression, and clustering algorithms and is designed to
interoperate with Python's numerical and scientific libraries. After preprocessing, we
apply the Random Forest Algorithm. The preprocessed dataset is analyzed again and a
confusion matrix is obtained. In the confusion matrix, the predictions are partitioned into
four blocks: True Positive (TP), True Negative (TN), False Positive (FP), and False
Negative (FN). The dataset is partitioned repeatedly until all the data is validated. The
partitioned data is then evaluated and finally represented as separate graphs. Taken
separately, these graphs give only limited accuracy for the resultant dataset. So, to
obtain better accuracy, we use the Random Forest Algorithm, which takes all the graph
values and gives only the necessary values, with better accuracy than the other algorithms
considered.
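
The four blocks of the confusion matrix, and the accuracy, precision, and recall derived from them, can be computed as in the sketch below. It uses plain Python rather than Scikit-learn so that it runs without extra packages; the function names and the ten example labels are made up for illustration.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count TP, TN, FP and FN, treating `positive` as the fraud class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Derive accuracy, precision and recall from the four counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for ten transactions (1 = fraud, 0 = genuine).
actual = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
predicted = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
counts = confusion_matrix(actual, predicted)
scores = metrics(*counts)
```

Precision answers "of the transactions flagged as fraud, how many really were?", while recall answers "of the real frauds, how many were flagged?".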

11. SYSTEM ARCHITECTURE
In our architecture, we start with a credit card dataset that contains all the details of
the credit card transactions. Here we take only the Amount and the transaction Time for
analysis and preprocessing. The next step is data cleaning, where the dataset is analyzed
and all duplicate and null values are eliminated. The next step is data partitioning, where
the credit card dataset is split into two partitions: a training dataset and a testing
dataset. After that, the Random Forest Algorithm is applied and a confusion matrix is
obtained. Performance analysis is then carried out on the confusion matrix. This
performance analysis gives an accuracy of about 90% for this credit card fraud detection
system.
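
The cleaning and partitioning stages of this architecture can be sketched as two small functions. The row layout (Time, Amount, Class), the 25% test fraction, and the sample rows are assumptions made for the example, not values taken from the report.

```python
import random


def clean(rows):
    """Drop rows containing a missing value (None) and exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row)
        if None in key or key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into training and testing partitions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up rows in the form (Time, Amount, Class).
raw = [
    (0, 20.0, 0),
    (0, 20.0, 0),    # duplicate of the first row
    (1, None, 0),    # missing amount
    (2, 35.5, 0),
    (5, 12.5, 1),
    (7, 70.0, 0),
]
rows = clean(raw)
train_rows, test_rows = train_test_split(rows)
```

The training partition would then be passed to the Random Forest Algorithm, and the testing partition would be used to build the confusion matrix.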

13. FUTURE ENHANCEMENT
While we couldn’t reach our goal of 100% accuracy in fraud detection, we did
end up creating a system that can, with enough time and data, get very close
to that goal. As with any such project, there is some room for improvement
here. The very nature of this project allows multiple algorithms to be
integrated as modules, and their results can be combined to increase the
accuracy of the result. The model can be improved further by adding more
algorithms to it. However, the output of these algorithms needs to be in the
same format as the others. Once that condition is satisfied, the modules are
easy to add, as done in the code. This gives the project a great degree of
modularity and versatility. More room for improvement can be found in the
dataset. As demonstrated before, the precision of the algorithms increases
when the size of the dataset is increased. Hence, more data should make the
model more accurate in detecting fraud and reduce the number of false
positives. However, this requires official support from the banks
themselves.

14. CONCLUSION
Credit card fraud is without a doubt an act of criminal dishonesty. This article
has listed the most common methods of fraud along with their detection
methods and reviewed recent findings in this field. This paper has also
explained in detail how machine learning can be applied to get better results
in fraud detection, along with the algorithm, pseudocode, an explanation of its
implementation, and experimental results. While the algorithm does reach
over 99.6% accuracy, its precision remains only at 28% when a tenth of the
dataset is taken into consideration. However, when the entire dataset is fed
into the algorithm, the precision rises to 33%. This high accuracy is to be
expected given the huge imbalance between the number of valid and the number
of fraudulent transactions.
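
The arithmetic behind this observation can be checked with the dataset figures given in the Dataset section. The sketch below is illustrative only: the "label everything as genuine" baseline and the example counts of caught frauds and false alarms are hypothetical and are not results from this report.

```python
# Figures quoted in the Dataset section of this report.
total_transactions = 284807
fraud_rate = 0.00172  # 0.172% of transactions are fraudulent

# Approximate fraud count implied by those figures (rounded to a whole number).
fraud_count = round(total_transactions * fraud_rate)
genuine_count = total_transactions - fraud_count

# A hypothetical "model" that labels every transaction as genuine is right
# on every genuine transaction and wrong on every fraudulent one.
baseline_accuracy = genuine_count / total_transactions


def precision(true_positives, false_positives):
    """Share of flagged transactions that really are fraudulent."""
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


# Illustrative counts only: 100 frauds caught alongside 200 false alarms.
example_precision = precision(100, 200)
example_errors = 200 + (fraud_count - 100)  # false alarms plus missed frauds
example_accuracy = (total_transactions - example_errors) / total_transactions
```

Even the baseline that never flags fraud scores above 99.8% accuracy on data this imbalanced, which is why precision and recall are the more informative measures here.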

15. References
• [1] Y. Abakarim, M. Lahby, and A. Attioui, ‘‘An efficient real time model for credit card fraud
detection based on deep learning,’’ in Proc. 12th Int. Conf. Intell. Systems: Theories Appl.,
Oct. 2018, pp. 1–7, doi: 10.1145/3289402.3289530.
• [2] H. Abdi and L. J. Williams, ‘‘Principal component analysis,’’ Wiley Interdiscipl. Rev.,
Comput. Statist., vol. 2, no. 4, pp. 433–459, Jul. 2010, doi: 10.1002/wics.101.
• [3] V. Arora, R. S. Leekha, K. Lee, and A. Kataria, ‘‘Facilitating user authorization from
imbalanced data logs of credit cards using artificial intelligence,’’ Mobile Inf. Syst., vol.
2020, pp. 1–13, Oct. 2020, doi: 10.1155/2020/8885269.
• [4] A. O. Balogun, S. Basri, S. J. Abdulkadir, and A. S. Hashim, ‘‘Performance analysis of
feature selection methods in software defect prediction: A search method approach,’’ Appl.
Sci., vol. 9, no. 13, p. 2764, Jul. 2019, doi: 10.3390/app9132764.
• [5] B. Bandaranayake, ‘‘Fraud and corruption control at education system level: A case
study of the Victorian department of education and early childhood development in
Australia,’’ J. Cases Educ. Leadership, vol. 17, no. 4, pp. 34–53, Dec. 2014, doi:
10.1177/1555458914549669.
• [6] J. Błaszczyński, A. T. de Almeida Filho, A. Matuszyk, M. Szeląg, and R. Słowiński, ‘‘Auto
loan fraud detection using dominance-based rough set approach versus machine learning
methods,’’ Expert Syst. Appl., vol. 163, Jan. 2021, Art. no. 113740, doi:
10.1016/j.eswa.2020.113740.
• [7] B. Branco, P. Abreu, A. S. Gomes, M. S. C. Almeida, J. T. Ascensão, and P. Bizarro,
‘‘Interleaved sequence RNNs for fraud detection,’’ in Proc. 26th ACM SIGKDD Int. Conf.
Knowl. Discovery Data Mining, 2020, pp. 3101–3109, doi: 10.1145/3394486.3403361.
• [8] F. Cartella, O. Anunciacao, Y. Funabiki, D. Yamaguchi, T. Akishita, and O. Elshocht,
‘‘Adversarial attacks for tabular data: Application to fraud detection and imbalanced data,’’
2021, arXiv:2101.08030.

• [9] S. S. Lad and A. C. Adamuthe, ‘‘Malware classification with improved convolutional
neural network model,’’ Int. J. Comput. Netw. Inf. Secur., vol. 12, no. 6, pp. 30–43,
Dec. 2021, doi: 10.5815/ijcnis.2020.06.03.
• [10] V. N. Dornadula and S. Geetha, ‘‘Credit card fraud detection using machine learning
algorithms,’’ Proc. Comput. Sci., vol. 165, pp. 631–641, Jan. 2019, doi:
10.1016/j.procs.2020.01.057.
• [11] I. Benchaji, S. Douzi, and B. E. Ouahidi, ‘‘Credit card fraud detection model based on
LSTM recurrent neural networks,’’ J. Adv. Inf. Technol., vol. 12, no. 2, pp. 113–118, 2021,
doi: 10.12720/jait.12.2.113-118.
• [12] Y. Fang, Y. Zhang, and C. Huang, ‘‘Credit card fraud detection based on machine
learning,’’ Comput., Mater. Continua, vol. 61, no. 1, pp. 185–195, 2019, doi:
10.32604/cmc.2019.06144.
• [13] J. Forough and S. Momtazi, ‘‘Ensemble of deep sequential models for credit card fraud
detection,’’ Appl. Soft Comput., vol. 99, Feb. 2021, Art. no. 106883, doi:
10.1016/j.asoc.2020.106883.
• [14] K. He, X. Zhang, S. Ren, and J. Sun, ‘‘Deep residual learning for image recognition,’’
2015, arXiv:1512.03385.
• [15] X. Hu, H. Chen, and R. Zhang, ‘‘Short paper: Credit card fraud detection using LightGBM
with asymmetric error control,’’ in Proc. 2nd Int. Conf. Artif. Intell. for Industries (AII), Sep.
2019, pp. 91–94, doi: 10.1109/AI4I46381.2019.00030.
• [16] J. Kim, H.-J. Kim, and H. Kim, ‘‘Fraud detection for job placement using hierarchical
clusters-based deep neural networks,’’ Int. J. Speech Technol., vol. 49, no. 8, pp. 2842–
2861, Aug. 2019, doi: 10.1007/s10489-019-01419-2.

• https://www.javatpoint.com/machine-learning-random-forest-algorithm
• https://journalofbigdata.springeropen.com/articles/10.1186/s40537-022-00573-8
• https://www.indeed.com/career-advice/career-development/what-is-resampling

• https://www.researchgate.net/publication/336800562_Credit_Card_Fraud_Detection_using_Machine_Learning_and_Data_Science
• https://en.wikipedia.org//wiki/Random_forest
• https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9755930
• https://edu.authorcafe.com/academies/7920/a-report-on-decision-tree-random-forest-and-deep-forest
