AlexNet is a popular convolutional neural network architecture that won the ImageNet 2012 challenge by a large margin. It was developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton. It is similar to the LeNet-5 architecture, but larger and deeper. If you want to learn more about the AlexNet CNN architecture, this article is for you. In this article, I’ll take you through an introduction to the AlexNet architecture and its implementation using Python.
AlexNet
The AlexNet CNN architecture is one of the most popular Convolutional Neural Network architectures. It was the first CNN architecture to stack convolutional layers directly on top of one another. The table below describes the complete AlexNet architecture, from the output layer down to the input:
| Layer | Type | Maps | Size | Kernel Size | Stride | Padding | Activation |
|---|---|---|---|---|---|---|---|
| Out | Fully connected | – | 1000 | – | – | – | Softmax |
| F10 | Fully connected | – | 4096 | – | – | – | ReLU |
| F9 | Fully connected | – | 4096 | – | – | – | ReLU |
| S8 | Max pooling | 256 | 6×6 | 3×3 | 2 | valid | – |
| C7 | Convolution | 256 | 13×13 | 3×3 | 1 | same | ReLU |
| C6 | Convolution | 384 | 13×13 | 3×3 | 1 | same | ReLU |
| C5 | Convolution | 384 | 13×13 | 3×3 | 1 | same | ReLU |
| S4 | Max pooling | 256 | 13×13 | 3×3 | 2 | valid | – |
| C3 | Convolution | 256 | 27×27 | 5×5 | 1 | same | ReLU |
| S2 | Max pooling | 96 | 27×27 | 3×3 | 2 | valid | – |
| C1 | Convolution | 96 | 55×55 | 11×11 | 4 | valid | ReLU |
| In | Input | 3 (RGB) | 227×227 | – | – | – | – |
This Convolutional Neural Network architecture contains eight layers with weights: the first five are convolutional, and the remaining three are fully connected.
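A quick way to sanity-check those weighted layers is to count parameters by hand: a convolutional layer with k×k kernels, c input channels, and f filters has (k·k·c + 1)·f weights and biases. Here is a tiny helper, written just for illustration; the results match the model summary shown later in this article:

```python
def conv_params(kernel_size, in_channels, filters):
    """Parameter count (weights + biases) of a 2D convolutional layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

print(conv_params(11, 3, 96))   # C1: 34,944 parameters
print(conv_params(5, 96, 256))  # C3: 614,656 parameters
```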

The first convolutional layer filters the 227×227×3 input image with 96 kernels of size 11×11×3 and a stride of 4 pixels. In the original two-GPU training setup, the kernels of the second, fourth, and fifth convolutional layers are connected only to the kernel maps of the previous layer that reside on the same GPU. The kernels of the third convolutional layer are connected to all the kernel maps of the second convolutional layer, and the neurons in the fully-connected layers are connected to all the neurons in the previous layer. A quick way to verify the feature-map sizes in the table above is shown below.
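For "valid" convolutions and pooling, the spatial output size is floor((input − kernel) / stride) + 1. This small helper, written just for illustration, reproduces the sizes in the table:

```python
def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a conv/pooling layer with 'valid' padding."""
    return (input_size - kernel_size) // stride + 1

print(conv_output_size(227, 11, 4))  # C1: 55 (55x55 feature maps)
print(conv_output_size(55, 3, 2))    # S2: 27
print(conv_output_size(27, 3, 2))    # S4: 13
print(conv_output_size(13, 3, 2))    # S8: 6
```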
So this is the overall architecture of the AlexNet CNN. You can learn more about it from the original research paper, “ImageNet Classification with Deep Convolutional Neural Networks” (Krizhevsky et al., 2012). Now, in the section below, I will take you through the implementation of the AlexNet architecture using Python.
AlexNet Architecture using Python
I hope you have now understood the complete architecture of the AlexNet CNN. To implement it using Python, we can use the TensorFlow and Keras libraries. I will also use the visualkeras library in Python to visualize the architecture of AlexNet. Below is how to implement the AlexNet architecture using Python (note that this version ends in a 10-class softmax head with a single 4096-unit dense layer, rather than the original two 4096-unit layers and 1000-class output):
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
# C1: 96 kernels of 11x11, stride 4, 'valid' padding -> 55x55x96
model.add(layers.Conv2D(filters=96, kernel_size=(11, 11), strides=(4, 4),
                        activation="relu", input_shape=(227, 227, 3)))
model.add(layers.BatchNormalization())
# S2: 3x3 max pooling, stride 2 -> 27x27x96
model.add(layers.MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
# C3: 256 kernels of 5x5, 'same' padding -> 27x27x256
model.add(layers.Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1),
                        activation="relu", padding="same"))
model.add(layers.BatchNormalization())
# S4: 3x3 max pooling, stride 2 -> 13x13x256
model.add(layers.MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
# C5-C7: three 3x3 convolutional layers stacked directly on one another
model.add(layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1),
                        activation="relu", padding="same"))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1),
                        activation="relu", padding="same"))
model.add(layers.BatchNormalization())
model.add(layers.Conv2D(filters=256, kernel_size=(3, 3), strides=(1, 1),
                        activation="relu", padding="same"))
model.add(layers.BatchNormalization())
# S8: 3x3 max pooling, stride 2 -> 6x6x256
model.add(layers.MaxPool2D(pool_size=(3, 3), strides=(2, 2)))
model.add(layers.Flatten())
# Fully connected head, adapted here to 10 output classes
model.add(layers.Dense(4096, activation="relu"))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation="softmax"))

model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              metrics=["accuracy"])
model.summary()
```
Model: "sequential_4" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d_23 (Conv2D) (None, 55, 55, 96) 34944 batch_normalization_20 (Bat (None, 55, 55, 96) 384 chNormalization) max_pooling2d_12 (MaxPoolin (None, 27, 27, 96) 0 g2D) conv2d_24 (Conv2D) (None, 27, 27, 256) 614656 batch_normalization_21 (Bat (None, 27, 27, 256) 1024 chNormalization) max_pooling2d_13 (MaxPoolin (None, 13, 13, 256) 0 g2D) conv2d_25 (Conv2D) (None, 13, 13, 384) 885120 batch_normalization_22 (Bat (None, 13, 13, 384) 1536 chNormalization) conv2d_26 (Conv2D) (None, 13, 13, 384) 1327488 batch_normalization_23 (Bat (None, 13, 13, 384) 1536 chNormalization) conv2d_27 (Conv2D) (None, 13, 13, 256) 884992 batch_normalization_24 (Bat (None, 13, 13, 256) 1024 chNormalization) max_pooling2d_14 (MaxPoolin (None, 6, 6, 256) 0 g2D) flatten_4 (Flatten) (None, 9216) 0 dense_8 (Dense) (None, 4096) 37752832 dropout_4 (Dropout) (None, 4096) 0 dense_9 (Dense) (None, 10) 40970 ================================================================= Total params: 41,546,506 Trainable params: 41,543,754 Non-trainable params: 2,752 _________________________________________________________________
Now here is how you can visualize the architecture of this neural network:
```python
import visualkeras

visualkeras.layered_view(model)
```
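If you want a labeled diagram saved as an image rather than an inline plot, a small variation like the following should work, assuming the legend and to_file options available in current visualkeras releases:

```python
import visualkeras

# Add a per-layer-type legend and write the diagram to a PNG file
visualkeras.layered_view(model, legend=True, to_file="alexnet.png")
```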

Summary
The AlexNet CNN architecture is one of the most popular Convolutional Neural Network architectures, and it was the first to stack convolutional layers directly on top of one another. I hope you liked this article on an introduction to the AlexNet CNN architecture and its implementation using Python. Feel free to ask your valuable questions in the comments section below.