No two people have the same fingerprints; they are unique to every human. Fingerprints can, however, be used to classify gender as male or female. In this article, I will take you through a Gender Classification Model which I will train using Deep Learning and Convolutional Neural Networks.
Now you must be wondering how I will do Gender Classification. I will build a Convolutional Neural Network that classifies gender from fingerprints. For this task, I will use a dataset that contains over 55,000 fingerprint images covering every finger.
Gender Classification using Fingerprints
I will first convert the images into arrays of pixel values so that the network can extract features from the fingerprints. Then I will split the data into training, validation and test sets. Now let’s start with importing the libraries we need for this task and performing some steps of data preparation:
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
import os
import cv2
import matplotlib.pyplot as plt
Creating a Helper Function:
def extract_label(img_path, train = True):
    # Drop the directory and the extension, keeping only the bare file name
    filename, _ = os.path.splitext(os.path.basename(img_path))
    subject_id, etc = filename.split('__')
    # Altered (training) names carry one more '_'-separated field than real (test) names
    if train:
        gender, lr, finger, _, _ = etc.split('_')
    else:
        gender, lr, finger, _ = etc.split('_')
    # Encode the parsed fields as integers
    gender = 0 if gender == 'M' else 1
    lr = 0 if lr == 'Left' else 1
    if finger == 'thumb':
        finger = 0
    elif finger == 'index':
        finger = 1
    elif finger == 'middle':
        finger = 2
    elif finger == 'ring':
        finger = 3
    elif finger == 'little':
        finger = 4
    # Only the gender label is used in this article
    return np.array([gender], dtype=np.uint16)
The above function extracts the label of each image from its file name rather than from its pixels. It parses the file name, which encodes the subject ID, gender, hand and finger, and returns a one-element array holding the gender label: 0 represents male and 1 represents female.
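To make the parsing step concrete, here is a minimal sketch that runs the same splitting logic on a single file name. The file name `100__M_Left_thumb_finger_CR.BMP` and the helper `parse_fingerprint_name` are assumptions made for illustration; check the names in your copy of the dataset before relying on this layout.

```python
import os

def parse_fingerprint_name(img_path):
    """Parse a name of the assumed form '<id>__<gender>_<hand>_<finger>_...'."""
    # Keep only the file name, without directory or extension
    filename, _ = os.path.splitext(os.path.basename(img_path))
    # The double underscore separates the subject ID from the rest
    subject_id, rest = filename.split("__")
    # The first three '_'-separated fields are gender, hand and finger
    gender, hand, finger = rest.split("_")[:3]
    return {
        "subject_id": int(subject_id),
        "gender": 0 if gender == "M" else 1,  # same encoding as extract_label
        "hand": hand,
        "finger": finger,
    }

# Hypothetical path, used only to illustrate the parsing
example = parse_fingerprint_name("Altered-Easy/100__M_Left_thumb_finger_CR.BMP")
print(example)
```

Taking only the first three fields lets the same helper handle names with or without the trailing alteration code, which is the difference the `train` flag deals with in `extract_label`.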
Now our next step is to read every image in a folder, resize it, and pass its path to the function we created to find its label. You can download the dataset from here:
img_size = 96
def loading_data(path, train):
    print("loading data from: ", path)
    data = []
    for img in os.listdir(path):
        try:
            # Read as grayscale and resize to img_size x img_size
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
            img_resize = cv2.resize(img_array, (img_size, img_size))
            label = extract_label(os.path.join(path, img), train)
            data.append([label[0], img_resize])
        except Exception as e:
            # Skip files that cannot be read or whose names do not parse
            pass
    return data
Now, the next step is to define the dataset directories and call the loading function on each of them:
Real_path = "../input/aman/web/Real"
Easy_path = "../input/aman/web/Altered-Easy"
Medium_path = "../input/aman/web/Altered-Medium"
Hard_path = "../input/aman/web/Altered-Hard"
Easy_data = loading_data(Easy_path, train = True)
Medium_data = loading_data(Medium_path, train = True)
Hard_data = loading_data(Hard_path, train = True)
test = loading_data(Real_path, train = False)
# Join the plain Python lists; random.shuffle works reliably on a list of [label, image] pairs
data = Easy_data + Medium_data + Hard_data
del Easy_data, Medium_data, Hard_data
Now let’s shuffle the data and inspect it to see what it looks like:
import random
random.shuffle(data)
random.shuffle(test)
array([[1, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)],
       [1, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)],
       [1, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)],
       ...,
       [0, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)],
       [0, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)],
       [1, array([[160, 158, 158, ...,   0,   0,   0],
       [160, 105, 121, ...,   0,   0,   0],
       [160, 105, 255, ...,   0,   0,   0],
       ...,
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0],
       [  0,   0,   0, ...,   0,   0,   0]], dtype=uint8)]], dtype=object)
The output above shows how our images are stored in the data. Each entry pairs a label (0 or 1) with a 2-D array of pixel values between 0 and 255. Don’t get confused: this array of numbers is how the computer really sees an image.
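To illustrate that representation on a small scale, the sketch below builds one hand-made entry of the same `[label, pixels]` form. The 4 x 4 pixel values are invented for illustration and do not come from the dataset.

```python
import numpy as np

# A made-up 4 x 4 grayscale "image": 0 is black, 255 is white
pixels = np.array(
    [[160, 158, 158,   0],
     [160, 105, 121,   0],
     [160, 105, 255,   0],
     [  0,   0,   0,   0]],
    dtype=np.uint8,
)

# One entry in the same [label, image] layout as the loaded data
sample = [1, pixels]

label, image = sample
print("label:", label)                      # 1 means female in this article's encoding
print("shape:", image.shape)                # (height, width) of a grayscale image
print("dtype:", image.dtype)                # 8-bit unsigned integers
print("range:", image.min(), image.max())   # darkest and brightest pixel
```

A real entry has the same structure, only with a 96 x 96 array in place of the 4 x 4 one.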

Now I will split the image arrays and image labels:
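from tensorflow.keras.utils import to_categorical
img, labels = [], []
for label, feature in data:
    labels.append(label)
    img.append(feature)
# Stack into shape (N, 96, 96, 1) and scale the pixel values to [0, 1]
train_data = np.array(img).reshape(-1, img_size, img_size, 1)
train_data = train_data / 255.0
# One-hot encode the labels to match the two-unit softmax output of the model
train_labels = to_categorical(labels, num_classes = 2)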
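The reshaping, scaling and label encoding can be checked on toy data before running them on the full dataset. The following is a minimal sketch that assumes random 4 x 4 stand-in images and uses `np.eye` indexing to produce the one-hot layout that a two-class categorical loss expects; the image size and labels here are invented for illustration.

```python
import numpy as np

img_size = 4                      # tiny stand-in for the article's 96
rng = np.random.default_rng(0)    # fixed seed so the run is repeatable

# Three fake grayscale images and their gender labels (0 = male, 1 = female)
images = [rng.integers(0, 256, size=(img_size, img_size), dtype=np.uint8)
          for _ in range(3)]
labels = [0, 1, 1]

# Add a trailing channel axis and scale the pixels to [0, 1]
x = np.array(images).reshape(-1, img_size, img_size, 1) / 255.0

# One-hot encode: row i of the identity matrix is the vector for class i
y = np.eye(2, dtype=np.float32)[labels]

print(x.shape)   # (number of images, height, width, channels)
print(y)
```

The trailing axis of size 1 is the channel dimension that `Conv2D` expects for grayscale input.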
Building CNN for Gender Classification Model
Now I will build a Convolutional Neural Network for the Gender Classification Model:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from tensorflow.keras import layers
from tensorflow.keras import optimizers
model = Sequential([
    Conv2D(32, 3, padding='same', activation='relu', kernel_initializer='he_uniform', input_shape = [96, 96, 1]),
    MaxPooling2D(2),
    Conv2D(32, 3, padding='same', kernel_initializer='he_uniform', activation='relu'),
    MaxPooling2D(2),
    Flatten(),
    Dense(128, kernel_initializer='he_uniform', activation = 'relu'),
    Dense(2, activation = 'softmax'),
])
Now, before moving forward, let’s have a quick look at the summary of our CNN Model:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 96, 96, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 48, 48, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 48, 48, 32)        9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 24, 24, 32)        0
_________________________________________________________________
flatten (Flatten)            (None, 18432)             0
_________________________________________________________________
dense (Dense)                (None, 128)               2359424
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 258
=================================================================
Total params: 2,369,250
Trainable params: 2,369,250
Non-trainable params: 0
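The parameter counts in the summary can be reproduced by hand. The sketch below uses the standard formulas for convolutional and dense layers with a bias term; the helper names `conv2d_params` and `dense_params` are made up for this illustration.

```python
def conv2d_params(kernel, in_channels, filters):
    # Each filter has kernel * kernel * in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    # Each unit has one weight per input plus one bias
    return (inputs + 1) * units

conv1 = conv2d_params(3, 1, 32)        # first Conv2D on the 1-channel input
conv2 = conv2d_params(3, 32, 32)       # second Conv2D on 32 feature maps
flat = 24 * 24 * 32                    # size after two 2x2 poolings of 96 x 96
dense1 = dense_params(flat, 128)
dense2 = dense_params(128, 2)

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, flat, dense1, dense2, total)
```

Almost all of the parameters sit in the first dense layer, because it connects all 18,432 flattened values to each of its 128 units.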
Now, I will compile the model using the Adam optimizer with a learning rate of 0.001 (1e-3). To keep our CNN model from overfitting, I will use the EarlyStopping callback, which stops training once the validation loss has not improved for 10 consecutive epochs:
model.compile(optimizer = optimizers.Adam(1e-3), loss = 'categorical_crossentropy', metrics = ['accuracy'])
early_stopping_cb = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
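The idea behind the patience setting can be sketched in a few lines of plain Python. This is a simplified illustration of the rule, not the Keras implementation, and the validation losses below are invented.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            # No improvement: count the epoch and stop once patience runs out
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses, one per epoch
losses = [0.60, 0.45, 0.40, 0.41, 0.42, 0.43, 0.39]
stopped_at = early_stopping_epoch(losses, patience=3)
never_stopped = early_stopping_epoch(losses, patience=10)
print(stopped_at, never_stopped)
```

With a patience of 3, this run stops before the later improvement at epoch 7 is ever seen, which is why a more generous patience such as 10 is a common choice for short training runs.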
Now, let’s fit our Gender Classification Model. We are going to train the model for up to 30 epochs:
history = model.fit(train_data, train_labels, batch_size = 128, epochs = 30,
validation_split = 0.2, callbacks = [early_stopping_cb], verbose = 1)
This prints a long log covering up to 30 epochs, reporting the training loss and accuracy along with the validation loss and accuracy for each epoch. Training is much faster on a GPU, so I recommend using Google Colab for better performance.
We have now trained our CNN for the Gender Classification Model, reaching a training accuracy of 99 per cent and a validation accuracy of 98 per cent. Now, let’s have a look at the performance of our Gender Classification Model:
import pandas as pd
import matplotlib.pyplot as plt
pd.DataFrame(history.history).plot(figsize = (8,5))
plt.grid(True)
plt.gca().set_ylim(0,1)

Testing Gender Classification Model
As we split the training images into image labels and image arrays while training our model, we need to repeat the same process on the test images:
from tensorflow.keras.utils import to_categorical
test_images, test_labels = [], []
for label, feature in test:
    test_images.append(feature)
    test_labels.append(label)
# Same reshaping and scaling as for the training images
test_images = np.array(test_images).reshape(-1, img_size, img_size, 1)
test_images = test_images / 255.0
del test
test_labels = to_categorical(test_labels, num_classes = 2)
Now, let’s evaluate the performance of our CNN model on the test set:
model.evaluate(test_images, test_labels)
6000/6000 [==============================] - 1s 134us/sample - loss: 0.0231 - accuracy: 0.9972
[0.023149466800407026, 0.9971667]
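The accuracy reported by `model.evaluate` can be turned back into a count of correct predictions. The sketch below takes the sample count and accuracy from the output above and adds a small, hypothetical `accuracy_from_probs` helper that shows how accuracy follows from softmax outputs and one-hot labels.

```python
n_samples = 6000
accuracy = 0.9971667

# Accuracy is correct / total, so multiply back and round to the nearest integer
correct = round(n_samples * accuracy)
wrong = n_samples - correct
print(correct, wrong)

def accuracy_from_probs(probs, one_hot):
    """Fraction of rows where the most probable class matches the true class."""
    hits = 0
    for p, t in zip(probs, one_hot):
        predicted = p.index(max(p))   # argmax over the softmax output
        actual = t.index(max(t))      # position of the 1 in the one-hot label
        hits += predicted == actual
    return hits / len(probs)

# Invented softmax outputs and labels for four samples
toy_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
toy_labels = [[1, 0], [0, 1], [0, 1], [0, 1]]
toy_accuracy = accuracy_from_probs(toy_probs, toy_labels)
print(toy_accuracy)
```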
So, our gender classification model reached an accuracy of 99.72 per cent with a loss value of 0.0231 on the test set. I hope you liked this article; feel free to ask your valuable questions in the comments section below.