Bank Fraud Project


The idea of this project is simple: detect bank fraud. To do this, I am using a dataset called "Bank Account Fraud Dataset Suite (NeurIPS 2022)".

At the start of this project I chose to go with an autoencoder because of a hardware limitation: I am using an AMD GPU, which means I need TensorFlow-DirectML.

On the left is the first generated confusion matrix:

  • True Positives (bottom-right): fraudulent transactions correctly classified as fraud.

  • True Negatives (top-left): non-fraudulent transactions correctly classified as non-fraud.

  • False Positives (top-right): non-fraudulent transactions misclassified as fraud.

  • False Negatives (bottom-left): fraudulent transactions misclassified as non-fraud.
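The layout described above (rows are the actual class, columns are the predicted class) can be reproduced in a few lines of plain Python. This is a minimal sketch and not the project's code: the function name `confusion_matrix_2x2` and the toy labels are made up for illustration, with 1 meaning fraud and 0 meaning non-fraud.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]]: rows are actual classes, columns are predicted."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

# Toy labels (hypothetical): 1 = fraud, 0 = non-fraud.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]

(tn, fp), (fn, tp) = confusion_matrix_2x2(y_true, y_pred)
print(f"TN={tn} FP={fp}")  # top row: actual non-fraud
print(f"FN={fn} TP={tp}")  # bottom row: actual fraud
```

Indexing the matrix as `matrix[actual][predicted]` is what puts true negatives in the top-left cell and true positives in the bottom-right cell.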

My current model classifies non-fraud cases with high accuracy but struggles to correctly identify fraud cases, producing many false negatives (fraud misclassified as non-fraud). This could be due to the imbalance between fraud and non-fraud cases, even with SMOTE applied.
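A short worked example shows why overall accuracy can look good while fraud detection is weak. The counts below are invented for illustration and are not this model's results; `summarize` is a hypothetical helper.

```python
def summarize(tn, fp, fn, tp):
    """Compute accuracy plus fraud-class precision and recall from the four counts."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    # Guard against division by zero when no fraud is predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

# Hypothetical counts: 10,000 cases, 100 of them fraudulent (1%).
metrics = summarize(tn=9850, fp=50, fn=70, tp=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy is 0.988 while recall on the fraud class is only 0.300, so precision and recall on fraud are likely more informative than accuracy for judging the model.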

Potential fix: analyze the other dataset variations included in the suite. If they have the same features, combining them all into one dataset may reduce the imbalance.
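The check described here could be sketched as follows, using only the standard library. Several things are assumptions: the label column name `fraud_bool`, the `income` feature, the helper names, and the tiny in-memory CSVs that stand in for the real files.

```python
import csv
import io

LABEL = "fraud_bool"  # assumed name of the label column

def read_rows(handle):
    """Read a CSV file-like object into (header, list of row dicts)."""
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows

def combine_if_same_features(sources):
    """Concatenate rows from all sources, but only if every header matches the first."""
    combined, reference = [], None
    for name, handle in sources.items():
        header, rows = read_rows(handle)
        if reference is None:
            reference = header
        elif header != reference:  # order-sensitive column comparison
            raise ValueError(f"{name} has different columns: {header}")
        combined.extend(rows)
    return combined

def fraud_rate(rows):
    """Fraction of rows labeled as fraud."""
    return sum(int(r[LABEL]) for r in rows) / len(rows)

# Tiny in-memory stand-ins for the real CSV files (hypothetical values).
sources = {
    "base": io.StringIO("fraud_bool,income\n0,0.3\n0,0.9\n1,0.1\n0,0.5\n"),
    "variant": io.StringIO("fraud_bool,income\n0,0.7\n1,0.2\n"),
}
rows = combine_if_same_features(sources)
print(len(rows), "rows, fraud rate", round(fraud_rate(rows), 3))
```

With real data, open file handles would replace the `io.StringIO` objects. Comparing the fraud rate of each file with the fraud rate of the combined rows would show whether merging actually changes the class balance.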

Bank Fraud Model #1

Links

Original Dataset license

Bank Account Fraud Dataset Suite (NeurIPS 2022)
https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022

Authors

Sérgio Jesus, José Pombal, Duarte Alves, André F. Cruz, Pedro Saleiro, Rita P. Ribeiro, João Gama, Pedro Bizarro