Credit Card Fraud Detection with Machine Learning


In this article, I will build a credit card fraud detection model in Python using an autoencoder neural network.

Let's start by importing the libraries we need:

import pandas as pd
import numpy as np
import pickle
import matplotlib.pyplot as plt
from scipy import stats
import tensorflow as tf
import seaborn as sns
from pylab import rcParams
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE
from sklearn.metrics import classification_report, accuracy_score
from keras.models import Model, load_model
from keras.layers import Input, Dense
from keras.callbacks import ModelCheckpoint, TensorBoard
from keras import regularizers, Sequential
%matplotlib inline
sns.set(style='whitegrid', palette='muted', font_scale=1.5)
rcParams['figure.figsize'] = 14, 8
RANDOM_SEED = 42
LABELS = ["Normal", "Fraud"]

The data set I am going to use contains data about credit card transactions that occurred during a period of two days, with 492 frauds out of 284,807 transactions.

All variables in the data set are numerical. For privacy reasons, the original features have been transformed with PCA.

The two features that haven’t been changed are Time and Amount. Time contains the seconds elapsed between each transaction and the first transaction in the data set.

Download the data set

You can download the data set here: the Credit Card Fraud Detection dataset, available on Kaggle.

df = pd.read_csv("creditcard.csv")
df.head()

Checking the shape of the data

df.shape
(284807, 31)

Checking for null values

df.isnull().values.any()
False

There are no null values in the data.

Checking the number of records in each transaction class (fraud and non-fraud)

count_classes = df['Class'].value_counts(sort=True)
count_classes.plot(kind='bar', rot=0)
plt.title("Transaction class distribution")
plt.xticks(range(2), LABELS)
plt.xlabel("Class")
plt.ylabel("Frequency")

The data set is highly imbalanced. Let's look at the fraud (1) and non-fraud (0) transactions separately.

frauds = df[df.Class == 1]
normal = df[df.Class == 0]
frauds.shape
(492, 31)
normal.shape
(284315, 31)

Since only three of the features (Time, Amount, and Class) are not anonymized, let's explore them.

Checking the amount of money involved in each kind of transaction

# Fraud transactions
frauds.Amount.describe()
count     492.000000
mean      122.211321
std       256.683288
min         0.000000
25%         1.000000
50%         9.250000
75%       105.890000
max      2125.870000
Name: Amount, dtype: float64
# Non-fraud transactions
normal.Amount.describe()
count    284315.000000
mean         88.291022
std         250.105092
min           0.000000
25%           5.650000
50%          22.000000
75%          77.050000
max       25691.160000
Name: Amount, dtype: float64

Graphical representation of Amount

f, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
f.suptitle('Amount per transaction by class')

bins = 50

ax1.hist(frauds.Amount, bins = bins)
ax1.set_title('Fraud')

ax2.hist(normal.Amount, bins = bins)
ax2.set_title('Normal')

plt.xlabel('Amount ($)')
plt.ylabel('Number of Transactions')
plt.xlim((0, 20000))
plt.yscale('log')
plt.show()

Plotting time of transaction to check for correlations

f, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
f.suptitle('Time of transaction vs Amount by class')

ax1.scatter(frauds.Time, frauds.Amount)
ax1.set_title('Fraud')

ax2.scatter(normal.Time, normal.Amount)
ax2.set_title('Normal')

plt.xlabel('Time (in Seconds)')
plt.ylabel('Amount')
plt.show()

Time does not seem to be a useful feature for distinguishing normal from fraudulent cases, so I will drop it.

data = df.drop(['Time'], axis=1)

The transaction amounts in fraud and normal cases span very different ranges, so we scale them.

Scaling the Amount using StandardScaler

from sklearn.preprocessing import StandardScaler

data['Amount'] = StandardScaler().fit_transform(data['Amount'].values.reshape(-1, 1))

Building the model

We will use an autoencoder for the fraud detection model. We train the network only on non-fraudulent transactions, so that it learns a compact representation of normal behavior.

The idea behind this approach is to let the model learn the best possible representation of non-fraudulent cases, so that fraudulent cases automatically stand out as different.
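
One common way to operationalize this idea, shown here as a minimal sketch rather than the exact approach this article takes, is to score each transaction by its reconstruction error: a model trained only on normal transactions reconstructs them well and frauds poorly, so a simple threshold on the mean squared error flags suspicious cases. The sketch assumes the trained autoencoder and the scaled arrays x_scale and y built later in this article, and the threshold value is purely illustrative:

# Minimal sketch (assumes the trained `autoencoder` and the scaled
# features `x_scale` / labels `y` constructed later in this article)
reconstructions = autoencoder.predict(x_scale)
mse = np.mean(np.power(x_scale - reconstructions, 2), axis=1)

threshold = 0.05  # illustrative value; tune it on a validation set
print("Transactions flagged as fraud:", (mse > threshold).sum())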

non_fraud = data[data['Class'] == 0]
fraud = data[data['Class'] == 1]

df = pd.concat([non_fraud, fraud]).sample(frac=1, random_state=RANDOM_SEED).reset_index(drop=True)
X = df.drop(['Class'], axis = 1).values
Y = df["Class"].values

Splitting the data into 80% training and 20% testing

X_train, X_test = train_test_split(data, test_size=0.2, random_state=RANDOM_SEED)
X_train_fraud = X_train[X_train.Class == 1]
X_train = X_train[X_train.Class == 0]
X_train = X_train.drop(['Class'], axis=1)
y_test = X_test['Class']
X_test = X_test.drop(['Class'], axis=1)
X_train = X_train.values
X_test = X_test.values
X_train.shape

Autoencoder model

input_layer = Input(shape=(X.shape[1],))

## encoding part
encoded = Dense(100, activation='tanh', activity_regularizer=regularizers.l1(10e-5))(input_layer)
encoded = Dense(50, activation='relu')(encoded)

## decoding part
decoded = Dense(50, activation='tanh')(encoded)
decoded = Dense(100, activation='tanh')(decoded)

## output layer
output_layer = Dense(X.shape[1], activation='relu')(decoded)

Training the credit card fraud detection model

autoencoder = Model(input_layer, output_layer)
autoencoder.compile(optimizer="adadelta", loss="mse")
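
Before training, it is worth sanity-checking the architecture. Keras models expose a summary() method that prints each layer's output shape and parameter count:

autoencoder.summary()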

Scaling all the feature values to the [0, 1] range for the autoencoder

x = data.drop(["Class"], axis=1)
y = data["Class"].values

x_scale = MinMaxScaler().fit_transform(x.values)
x_norm, x_fraud = x_scale[y == 0], x_scale[y == 1]

autoencoder.fit(x_norm[0:2000], x_norm[0:2000], 
                batch_size = 256, epochs = 10, 
                shuffle = True, validation_split = 0.20);
Train on 1600 samples, validate on 400 samples
Epoch 1/10
1600/1600 [==============================] - 0s 222us/step - loss: 0.6763 - val_loss: 0.4293
Epoch 2/10
1600/1600 [==============================] - 0s 15us/step - loss: 0.4016 - val_loss: 0.2752
Epoch 3/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.2627 - val_loss: 0.1992
Epoch 4/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.1949 - val_loss: 0.1585
Epoch 5/10
1600/1600 [==============================] - 0s 15us/step - loss: 0.1719 - val_loss: 0.1544
Epoch 6/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.1702 - val_loss: 0.1567
Epoch 7/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.1678 - val_loss: 0.1405
Epoch 8/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.1493 - val_loss: 0.1440
Epoch 9/10
1600/1600 [==============================] - 0s 14us/step - loss: 0.1609 - val_loss: 0.1335
Epoch 10/10
1600/1600 [==============================] - 0s 13us/step - loss: 0.1408 - val_loss: 0.1326
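
The loss is still decreasing at epoch 10. If you keep the History object that fit() returns instead of discarding it, you can plot the training and validation loss to judge whether more epochs would help. A minimal sketch:

# Re-run training, keeping the History object returned by fit()
history = autoencoder.fit(x_norm[0:2000], x_norm[0:2000],
                          batch_size=256, epochs=10,
                          shuffle=True, validation_split=0.20)

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()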

Obtaining the hidden representation

hidden_representation = Sequential()
hidden_representation.add(autoencoder.layers[0])
hidden_representation.add(autoencoder.layers[1])
hidden_representation.add(autoencoder.layers[2])
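
Layers 0 through 2 of the trained model are the input layer and the two encoding Dense layers, so this Sequential model maps each 29-feature transaction to its 50-dimensional encoding. A quick shape check confirms this:

# Each scaled transaction (29 features) maps to a 50-dimensional code
print(hidden_representation.predict(x_norm[:5]).shape)  # (5, 50)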

Encoding the transactions with the trained model

norm_hid_rep = hidden_representation.predict(x_norm[:3000])
fraud_hid_rep = hidden_representation.predict(x_fraud)

Getting the representation data

rep_x = np.append(norm_hid_rep, fraud_hid_rep, axis = 0)
y_n = np.zeros(norm_hid_rep.shape[0])
y_f = np.ones(fraud_hid_rep.shape[0])
rep_y = np.append(y_n, y_f)

Train-test split on the representation data

train_x, val_x, train_y, val_y = train_test_split(rep_x, rep_y, test_size=0.25)

Credit Card Fraud Detection Prediction model

clf = LogisticRegression(solver="lbfgs").fit(train_x, train_y)
pred_y = clf.predict(val_x)

print("Classification Report:")
print(classification_report(val_y, pred_y))

print("Accuracy Score:", accuracy_score(val_y, pred_y))
Classification Report: 
              precision    recall  f1-score   support

         0.0       0.96      1.00      0.98       748
         1.0       1.00      0.76      0.86       125

    accuracy                           0.97       873
   macro avg       0.98      0.88      0.92       873
weighted avg       0.97      0.97      0.96       873


Accuracy Score:  0.9656357388316151
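
Accuracy alone can be misleading on such an imbalanced data set, so it also helps to inspect the confusion matrix. A short sketch using the LABELS defined at the top of the article:

from sklearn.metrics import confusion_matrix

# Rows are true classes, columns are predicted classes
conf_matrix = confusion_matrix(val_y, pred_y)
sns.heatmap(conf_matrix, xticklabels=LABELS, yticklabels=LABELS, annot=True, fmt="d")
plt.title("Confusion matrix")
plt.ylabel("True class")
plt.xlabel("Predicted class")
plt.show()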

