Birds inspired us to fly, nature inspired us to countless inventions. It seems logical, then to look at the brain’s architecture for inspiration on how to build an Intelligent Machine. This is the logic that sparked Artificial Neural Networks (ANN). ANN is a Machine Learning Model inspired by the networks of biological neurons found in our brains. However, although planes were inspired by birds, they don’t have to flap their wings. Similarly, ANN have gradually become quite different from their biological cousins. In this Article, I will build an Image Classification model with ANN to show you how ANN works.
Building an Image Classification with ANN
First, we need to load a dataset. In this Image Classification model we will tackle Fashion MNIST. It has a format of 60,000 grayscale images of 28 x 28 pixels each, with 10 classes. Let’s import some necessary libraries to start with this task:
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)
# Scikit-Learn ≥0.20 is required
import sklearn
assert sklearn.__version__ >= "0.20"
try:
# %tensorflow_version only exists in Colab.
%tensorflow_version 2.x
except Exception:
pass
# TensorFlow ≥2.0 is required
import tensorflow as tf
assert tf.__version__ >= "2.0"
# Common imports
import numpy as np
import os
# to make this notebook's output stable across runs
np.random.seed(42)
# To plot pretty figures
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)
Code language: Python (python)
Using Keras to Load the Dataset
Keras provide some quality functions to fetch and load common datasets, including MNIST, Fashion MNIST, and the California housing dataset. Let’s start by loading the fashion MNIST dataset to create an Image Classification model.
Keras has a number of functions to load popular datasets in keras.datasets
. The dataset is already split for you between a training set and a test set, but it can be useful to split the training set further to have a validation set:
import tensorflow as tf
from tensorflow import keras
fashion_mnist = keras.datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()
Code language: Python (python)
When loading MNIST or Fashion MNIST using Keras rather than Scikit-Learn, one important difference is that every image is represented as a 28 x 28 array rather than a 1D array of size 784. Moreover, the pixel intensities are represented as integers rather than the floats. Let’s take a look at the shape and data type of the training set:
X_train_full.shape
Code language: Python (python)
(60000, 28, 28)
X_train_full.dtype
Code language: Python (python)
dtype(‘uint8’)
Note that the dataset is already split into a training set and a test set, but there is no validation set, so we’ll create one now. Additionally, since we are going to train the ANN using Gradient Descent, we must scale the input features. For simplicity, I will scale the pixel intensities down to the 0-1 range by dividing them by 255.0:
X_valid, X_train = X_train_full[:5000] / 255., X_train_full[5000:] / 255.
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.
Code language: Python (python)
You can plot an image using Matplotlib’s imshow()
function, with a 'binary'
color map:
plt.imshow(X_train[0], cmap="binary")
plt.axis('off')
plt.show()
Code language: Python (python)

The labels are the class IDs (represented as uint8), from 0 to 9:
y_train
Code language: Python (python)
array([4, 0, 7, …, 3, 0, 5], dtype=uint8)
With MNIST, when the label is equal to 5, it means that the image represents the handwritten digit 5. easy. For Fashion MNIST, however, we need the list of class names to know what we are dealing with:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
"Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
Code language: Python (python)
For example, the first image in the training set represents a coat:
class_names[y_train[0]]
Code language: Python (python)
Coat
The validation set contains 5,000 images, and the test set contains 10,000 images:
X_valid.shape
Code language: Python (python)
(5000, 28, 28)
X_test.shape
Code language: Python (python)
(10000, 28, 28)
Let’s take a look at a sample of the images in the dataset:
n_rows = 4
n_cols = 10
plt.figure(figsize=(n_cols * 1.2, n_rows * 1.2))
for row in range(n_rows):
for col in range(n_cols):
index = n_cols * row + col
plt.subplot(n_rows, n_cols, index + 1)
plt.imshow(X_train[index], cmap="binary", interpolation="nearest")
plt.axis('off')
plt.title(class_names[y_train[index]], fontsize=12)
plt.subplots_adjust(wspace=0.2, hspace=0.5)
save_fig('fashion_mnist_plot', tight_layout=False)
plt.show()
Code language: Python (python)

Image Classification Model using Sequential API
Now, let’s build the neural network. Here is a classification MLP with two hidden layers:
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
model.add(keras.layers.Dense(10, activation="softmax"))
keras.backend.clear_session()
np.random.seed(42)
tf.random.set_seed(42)
Code language: Python (python)
Let’s go through the above code line by line:
- The first line creates a Sequential model. This is the simplest kind of Keras model for neural networks that are just composed of a single stack of layers connected sequentially. This is called the Sequential API.
- Next, we build the first layer and add it to the model. It is Flatten layer whose role is to convert each input image into a 1D array. If it receives input data X, it computes X.reshape(-1,1). This layer does not have any parameters, it is just there to do some simple preprocessing. Since it is the first layer in the model, you should specify the input_shape, which doesn’t include the batch size, only the shape of the instances. Alternatively, you could add a keras.layers.InputLayer as the first layer, setting input _shape = [28,28].
- Next we add a Dense hidden layer with 300 neurons.It will use the ReLU activation function. Each Dense layer manages its own weight matrix, containing all the connection weights between the neurons and their inputs. It also manages a vector of bias term.
- Then we add a second Dense hidden layer with 100 neurons, also using the ReLU activation function.
- Finally, we add a Dense output layer with 10 neurons, using the softmax qctivation function.
Instead of adding the layers one by one as we just did, you can pass a list of layers when creating the Sequential model:
model = keras.models.Sequential([
keras.layers.Flatten(input_shape=[28, 28]),
keras.layers.Dense(300, activation="relu"),
keras.layers.Dense(100, activation="relu"),
keras.layers.Dense(10, activation="softmax")
])
model.layers
Code language: Python (python)
The model’s summary() method will display all the model’s layers. including each layer’s name, it’s output shape, and it’s number of parameters, including trainable and non-trainable parameters.
model.summary()
Code language: Python (python)
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= flatten (Flatten) (None, 784) 0 _________________________________________________________________ dense (Dense) (None, 300) 235500 _________________________________________________________________ dense_1 (Dense) (None, 100) 30100 _________________________________________________________________ dense_2 (Dense) (None, 10) 1010 ================================================================= Total params: 266,610 Trainable params: 266,610 Non-trainable params: 0
keras.utils.plot_model(model, "my_fashion_mnist_model.png", show_shapes=True)
Code language: Python (python)

hidden1 = model.layers[1]
hidden1.name
Code language: Python (python)
dense
model.get_layer(hidden1.name) is hidden1
Code language: Python (python)
True
weights, biases = hidden1.get_weights()
Code language: Python (python)
array([[ 0.02448617, -0.00877795, -0.02189048, ..., -0.02766046, 0.03859074, -0.06889391], [ 0.00476504, -0.03105379, -0.0586676 , ..., 0.00602964, -0.02763776, -0.04165364], [-0.06189284, -0.06901957, 0.07102345, ..., -0.04238207, 0.07121518, -0.07331658], ..., [-0.03048757, 0.02155137, -0.05400612, ..., -0.00113463, 0.00228987, 0.05581069], [ 0.07061854, -0.06960931, 0.07038955, ..., -0.00384101, 0.00034875, 0.02878492], [-0.06022581, 0.01577859, -0.02585464, ..., -0.00527829, 0.00272203, -0.06793761]], dtype=float32)
biases
Code language: Python (python)
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)
Compiling the Image Classification Model
After a model is created, you must call its compile() methid to specify that the loss function and the optimizer to use. Optionally, you can specify a list of extra metrices to compute during training and evaluation:
model.compile(loss="sparse_categorical_crossentropy",
optimizer="sgd",
metrics=["accuracy"])
Code language: Python (python)
Training and Evaluating the Image Classification Model
Now the model is ready to be trained. For this we simply need to call its fit() method:
history = model.fit(X_train, y_train, epochs=30,
validation_data=(X_valid, y_valid))
Code language: Python (python)
The fit() method returns a History object containing the training parameters, the list of epochs it went through, and most importantly a dictionary containing the loss and extra metrics it measured at the end of each epoch on the training set and on the validation set. If you use this dictionary to create a pandsa DataFrame and call its plot(), then you can see the learning curves of our trained model:
import pandas as pd
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1)
save_fig("keras_learning_curves_plot")
plt.show()
Code language: Python (python)

You can see that both the training accuracy and the validation accuracy steadily increase during training, while the training loss and the validation loss decrease.
Once you are satisfied with your model’s validation accuracy, you should evaluate it on a test set to estimate the generalization error before you deploy it to the production. You can easily do this using the evaluate() method:
model.evaluate(X_test, y_test)
Code language: Python (python)
313/313 [==============================] - 0s 2ms/step - loss: 0.3382 - accuracy: 0.8822 [0.3381877839565277, 0.8822000026702881]
Use the Model to Make Predictions
Next, we can use the model’s predict() method to make predictions on new instances. Since we don’t have actual new instances, we will just use the first three instances of the test set:
X_new = X_test[:3]
y_proba = model.predict(X_new)
y_proba.round(2)
Code language: Python (python)
array([[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.03, 0. , 0.96], [0. , 0. , 0.99, 0. , 0.01, 0. , 0. , 0. , 0. , 0. ], [0. , 1. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. ]], dtype=float32)
y_pred = model.predict_classes(X_new)
y_pred
Code language: Python (python)
array([9, 2, 1])
np.array(class_names)[y_pred]
Code language: Python (python)
array([‘Ankle boot’, ‘Pullover’, ‘Trouser’], dtype='<U11′)
Here, the classification model actually classified all three images correctly:
y_new = y_test[:3]
plt.figure(figsize=(7.2, 2.4))
for index, image in enumerate(X_new):
plt.subplot(1, 3, index + 1)
plt.imshow(image, cmap="binary", interpolation="nearest")
plt.axis('off')
plt.title(class_names[y_test[index]], fontsize=12)
plt.subplots_adjust(wspace=0.2, hspace=0.5)
save_fig('fashion_mnist_images_plot', tight_layout=False)
plt.show()
Code language: Python (python)

Also, Read – Fake News Detection Model.
I hope you liked this article on Image Classification with Artificial Neural Networks (ANN). Feel free to ask your valuable questions in the comments section below.
[…] Also, Read: Image Classification with Neural Networks. […]
This image classification is executed successfully.thanks for the useful post
Thanks for your valuable feedback. Keep visiting us