THE 3rd INTERNATIONAL SCIENTIFIC CONFERENCES OF STUDENTS AND YOUNG RESEARCHERS
dedicated to the 99th anniversary of the National Leader of Azerbaijan Heydar Aliyev
companies to maintain highly secure systems that prevent fraudulent and insecure
actions. Total losses caused by fraud have tripled over the last decade; in
2021, for example, the figure rose to more than $33 billion, or 6.83 cents for
every $100 [1]. According to industry analysts, these losses are expected to
increase substantially. To counter this, security teams implement rule-based
systems to handle fraudulent transactions. However, this approach is not
effective: it requires considerable manual work and long processing times, and
criminals can outsmart the rules by finding ways around them. Hence, one of
the most effective and widely used approaches is to implement automatic fraud
detection algorithms. To protect customers from digital scammers, Machine
Learning and Deep Learning techniques are widely used to identify anomalous
transactions.
Machine Learning (ML) combines computer algorithms with statistical modelling
to perform tasks without hard-coded rules, because predictions are derived
from previously observed data. Deep Learning is a part of ML that covers
various Neural Networks (NN) which, if trained properly, can capture hidden
patterns useful for predicting fraud [2]. Using proper methods, it is feasible
to model the transactional behavior of each customer from their history, so
that each transaction can be classified. As the dataset is labelled,
supervised learning techniques with appropriate hyperparameter tuning for
imbalanced datasets are viable for this case. The dataset used is provided by
Vesta Corporation, but some columns are masked for security reasons.
Different Machine Learning and Deep Learning techniques will be passed through
a grid search to find the best method according to their metrics. Because the
target variable is imbalanced, AUC (Area Under the Curve), which measures the
area under the ROC curve, is used to evaluate model performance. A higher AUC
means the model distinguishes better between positive and negative cases. When
this score is close to 1, the model's separability is quite good. The X-axis
of the ROC curve depicts the false positive rate, while the Y-axis presents
the true positive rate of the predictions [3].
$$
\text{False Positive Rate} = \frac{\text{False Positives}}{\text{False Positives} + \text{True Negatives}},
\qquad
\text{True Positive Rate} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}
$$
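The two rates and the AUC can be illustrated with a minimal Python sketch. The labels, scores, and the 0.5 threshold below are made-up toy values, not results from the Vesta dataset, and the helper names `rates` and `roc_auc` are hypothetical. The AUC is computed through its rank interpretation (the probability that a randomly chosen fraud is scored above a randomly chosen legitimate transaction), which is equal to the area under the ROC curve.

```python
def rates(y_true, y_pred):
    """Return (false positive rate, true positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return fpr, tpr


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example: 8 legitimate transactions (0) and 2 frauds (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.35, 0.6, 0.7, 0.55]

# Hard predictions at an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

fpr, tpr = rates(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(fpr, tpr, auc)
```

The pairwise loop is quadratic in the number of transactions, so it is suitable only for illustration; on a dataset of realistic size a library routine would normally be used instead.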
The main purpose is to identify not only the fraudulent transactions but also
the fraudulent clients. Once a credit card owner has a reported fraud, their
entire account should be labelled as fraud. The main logic here is to flag the
reported fraudulent transactions and all transactions that follow them.
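One possible reading of this labelling rule is sketched below in Python. The field names `client_id`, `time`, and `is_fraud`, and the function `label_fraud_clients`, are hypothetical and chosen only for illustration; they are not the column names of the actual dataset. The sketch assumes that a transaction is labelled as fraud if it is the client's first reported fraud or occurs at or after it.

```python
def label_fraud_clients(transactions):
    """Label each transaction 1 if it is at or after the client's first
    reported fraud, else 0.

    `transactions` is a list of dicts with the hypothetical keys
    'client_id', 'time', and 'is_fraud'. The input is not modified.
    """
    # Pass 1: earliest reported fraud time per client.
    first_fraud = {}
    for tx in transactions:
        if tx["is_fraud"]:
            cid = tx["client_id"]
            first_fraud[cid] = min(first_fraud.get(cid, tx["time"]), tx["time"])

    # Pass 2: propagate the label to every later transaction of that client.
    labelled = []
    for tx in transactions:
        start = first_fraud.get(tx["client_id"])
        label = 1 if start is not None and tx["time"] >= start else 0
        labelled.append({**tx, "label": label})
    return labelled


# Toy example: client A has a reported fraud at time 2, client B has none.
txs = [
    {"client_id": "A", "time": 1, "is_fraud": 0},
    {"client_id": "A", "time": 2, "is_fraud": 1},
    {"client_id": "A", "time": 3, "is_fraud": 0},
    {"client_id": "B", "time": 1, "is_fraud": 0},
]
labels = [tx["label"] for tx in label_fraud_clients(txs)]
print(labels)
```

Under this assumption, transactions made before the first reported fraud keep their original label; if the whole account history should be marked instead, the time comparison can simply be dropped.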
The unique