MNIST digit classification using CNN in Keras

 

In this article, you will learn how to build a Convolutional Neural Network (CNN) from scratch using Keras for image classification on the MNIST dataset.

Load MNIST Dataset

The Modified National Institute of Standards and Technology (MNIST) dataset is a standard dataset commonly used for handwritten digit classification. Although MNIST is well understood and effectively solved, it remains useful for beginners who want to learn and explore how a Convolutional Neural Network works on image classification problems. It consists of 28 x 28 pixel grayscale images, each showing a single handwritten digit from 0 to 9, with 60,000 images in the training set and 10,000 separate images in the test set.

Let’s load the MNIST dataset using the Keras API and print the shapes of the training and test data.
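The loading step might look like the following minimal sketch. It assumes TensorFlow is installed and uses `tensorflow.keras.datasets.mnist`. The helper names `describe`, `load_mnist`, and `print_shapes` are hypothetical and only serve the illustration.

```python
import numpy as np


def describe(name: str, array: np.ndarray) -> str:
    """Return a one-line summary of an array's shape and dtype."""
    return f"{name}: shape={array.shape}, dtype={array.dtype}"


def load_mnist():
    """Return ((X_train, y_train), (X_test, y_test)) as NumPy arrays.

    Keras downloads the dataset on the first call and caches it locally.
    """
    # Imported here so that TensorFlow is only needed when the data is loaded.
    from tensorflow.keras.datasets import mnist
    return mnist.load_data()


def print_shapes():
    """Load MNIST and print the shape of every split."""
    (X_train, y_train), (X_test, y_test) = load_mnist()
    for name, array in [("X_train", X_train), ("y_train", y_train),
                        ("X_test", X_test), ("y_test", y_test)]:
        print(describe(name, array))
```

Calling `print_shapes()` should report shapes of (60000, 28, 28) and (60000,) for the training split and (10000, 28, 28) and (10000,) for the test split.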

Loading the data gives us the training and test images and their labels in separate variables. X_train and X_test contain the images of the training and test sets, while y_train and y_test contain the labels of the corresponding images.

Next, we can plot the first nine images of the training set using matplotlib.
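One way to draw such a grid is sketched here. The function name `plot_first_nine` and the 3 x 3 layout are assumptions for illustration, not the original code.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_first_nine(images: np.ndarray, labels: np.ndarray):
    """Draw the first nine images in a 3 x 3 grid and return the figure."""
    fig, axes = plt.subplots(3, 3, figsize=(6, 6))
    for ax, image, label in zip(axes.flat, images[:9], labels[:9]):
        ax.imshow(image, cmap="gray")  # grayscale colormap for digit images
        ax.set_title(f"Label: {label}")
        ax.axis("off")
    fig.tight_layout()
    return fig
```

With the data loaded, `plot_first_nine(X_train, y_train)` followed by `plt.show()` would display the grid.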

 
Preparing the MNIST Dataset

In Keras, the 2D convolution layer expects its input as a 4-dimensional array, which with the default channels-last format is (samples, height, width, channels). The images in the MNIST dataset, however, are stored in a 3-dimensional NumPy array of shape (samples, height, width). To convert the image data to a 4D array, we reshape it by adding the depth of the image as the fourth dimension. The depth of an image is its number of color channels, e.g. 3 for an RGB image and 1 for a grayscale image. Since MNIST contains only grayscale images, we set the depth to 1.
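The reshape can be sketched with NumPy alone. The helper name `add_channel_axis` is hypothetical, and the zero-filled array is only a stand-in with the same layout as a batch of MNIST images.

```python
import numpy as np


def add_channel_axis(images: np.ndarray) -> np.ndarray:
    """Reshape (samples, height, width) into (samples, height, width, 1)."""
    samples, height, width = images.shape
    return images.reshape(samples, height, width, 1)


# Stand-in array with the same layout as a batch of MNIST images.
batch = np.zeros((5, 28, 28), dtype=np.uint8)
print(add_channel_axis(batch).shape)  # (5, 28, 28, 1)
```

Applied to the real data, this would be `X_train = add_channel_axis(X_train)` and likewise for `X_test`.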

A grayscale image is a matrix of pixel intensities, each ranging from 0 to 255. We now normalize the image data by rescaling these pixel values to the [0, 1] range, which is done by dividing each value by the maximum value, i.e. 255. Before dividing, I convert the pixel values to float so that the division keeps its precision.
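A minimal sketch of this normalization, assuming the images are stored as unsigned 8-bit integers (the helper name `normalize_pixels` is hypothetical):

```python
import numpy as np


def normalize_pixels(images: np.ndarray) -> np.ndarray:
    """Convert integer pixel values in [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0


# Three sample intensities: 0 maps to 0.0 and 255 maps to 1.0.
pixels = np.array([0, 51, 255], dtype=np.uint8)
print(normalize_pixels(pixels))
```

Converting before dividing matters because the result has to be stored as floating-point values rather than integers.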

Next, the labels are integer values from 0 to 9, each indicating which of the 10 classes the corresponding image belongs to. We use one-hot encoding to convert each integer label into a 10-element one-hot vector with the to_categorical() utility function.
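The encoding can be tried on a few labels first. This sketch assumes `to_categorical` is imported from `tensorflow.keras.utils`, and the three labels are made-up examples.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

labels = np.array([5, 0, 9])  # three example digit labels
one_hot = to_categorical(labels, num_classes=10)

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # 1.0 at index 5, 0.0 everywhere else
```

For the full dataset the same call would be applied to `y_train` and `y_test`.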

Define CNN Architecture

Next, we need to define the Convolutional Neural Network (CNN) for the MNIST digit classification problem.

Convolutional Neural Networks (CNNs) are artificial neural networks designed for computer vision tasks and have proven effective in object detection, image classification, and face recognition. To build our CNN model, we will use the high-level Keras API, which runs on a TensorFlow backend. A CNN consists of different operations such as convolution, pooling, and classification. To perform these operations, I will import the Sequential model from Keras and add Conv2D, MaxPooling2D, Flatten, Dropout, and Dense layers.

A CNN can have as many layers as the complexity of the given problem requires. Since MNIST is simple and does not require heavy computation, I am using two convolutional layers, one dense layer, and one output layer with 10 neurons, one for each of our 10 classes. You may experiment with the number of layers as well as with different kernel sizes, pool sizes, activation functions, and dropout rates to get a better result.

The CNN model can be summarized as follows:

  1. The first block is composed of a convolutional layer with 32 kernels of size 3 x 3, followed by a max pooling operation with a pool size of 2 x 2.
  2. The second block is also composed of a convolutional layer, this time with 64 filters of size 3 x 3, followed by a max pooling operation with a pool size of 2 x 2 and a dropout of 20% for regularization, which helps to avoid overfitting.
  3. Next, a flattening operation transforms the data into a 1-dimensional vector so that it can be fed to the fully connected (dense) layers. The first dense layer consists of 128 neurons with relu activation, while the final output layer consists of 10 neurons with softmax activation, which outputs the probability for each of the 10 classes.
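The three blocks listed above could be written with the Keras Sequential API roughly as follows. This is a sketch, not necessarily the original code: the relu activation on the convolutional layers and the default "valid" padding are assumptions, since the description does not state them, and `build_model` is a hypothetical helper name.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model() -> keras.Model:
    """Build the two-block CNN described in the list above."""
    return keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        # Block 1: 32 kernels of 3 x 3, then 2 x 2 max pooling.
        layers.Conv2D(32, (3, 3), activation="relu"),  # relu is an assumption
        layers.MaxPooling2D(pool_size=(2, 2)),
        # Block 2: 64 filters of 3 x 3, 2 x 2 max pooling, 20% dropout.
        layers.Conv2D(64, (3, 3), activation="relu"),  # relu is an assumption
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.2),
        # Classifier: flatten, 128-neuron dense layer, 10-way softmax output.
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])


model = build_model()
model.summary()
```

Under these assumptions the feature maps shrink from 28 x 28 to 5 x 5 before flattening, so the dense layer receives 5 x 5 x 64 = 1600 inputs.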

 

Compile and train the CNN model

To compile the model, I have chosen the Adam optimizer with a learning rate of 0.001 along with the categorical cross-entropy loss function, which is a standard choice for multi-class classification with one-hot encoded labels. The model is then trained with a batch size of 32 for about 10 epochs. For training, I have used the test data as validation data instead of splitting the training data, so that the model has the full training set to learn from.
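The compile and fit calls with these settings might look like the sketch below. The wrapper `compile_and_train` is a hypothetical helper, and tracking the `accuracy` metric is an assumption made so that accuracy can be reported later.

```python
from tensorflow import keras


def compile_and_train(model, X_train, y_train, X_val, y_val,
                      epochs=10, batch_size=32):
    """Compile with Adam and categorical cross-entropy, then fit the model."""
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=0.001),
        loss="categorical_crossentropy",
        metrics=["accuracy"],  # assumption: accuracy is the tracked metric
    )
    # fit() returns a History object; its .history dict holds per-epoch values.
    history = model.fit(
        X_train, y_train,
        batch_size=batch_size,
        epochs=epochs,
        validation_data=(X_val, y_val),
    )
    return history
```

Following the setup described above, the call would be `history = compile_and_train(model, X_train, y_train, X_test, y_test)`.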

Once training is done, we can save the model as an H5 file for future use. While training the model, you can experiment with the batch size, the number of epochs, and different loss functions and optimizers to get better results and build some intuition for selecting appropriate values.
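Saving and later reloading the model could be done as in this sketch. The file name `mnist_cnn.h5` and the helper `save_and_reload` are hypothetical, and newer Keras versions may print a notice that HDF5 is a legacy format.

```python
from tensorflow import keras


def save_and_reload(model, path="mnist_cnn.h5"):
    """Save the model in HDF5 format and load it back from disk."""
    model.save(path)  # the .h5 extension selects the HDF5 format
    return keras.models.load_model(path)
```

The reloaded model can then be used for prediction without retraining.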

Plot the learning curve

Next, we plot the accuracy and loss on the training and validation data for the CNN model.
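A plotting helper along these lines would draw both curves. It assumes the model was compiled with the `accuracy` metric, so that the `History.history` dict returned by `fit()` contains the keys `accuracy`, `val_accuracy`, `loss`, and `val_loss`. The name `plot_learning_curves` is hypothetical.

```python
import matplotlib.pyplot as plt


def plot_learning_curves(history: dict):
    """Plot accuracy and loss per epoch from a Keras `History.history` dict."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(12, 4))
    for ax, metric in [(ax_acc, "accuracy"), (ax_loss, "loss")]:
        epochs = range(1, len(history[metric]) + 1)
        ax.plot(epochs, history[metric], label=f"train {metric}")
        ax.plot(epochs, history[f"val_{metric}"], label=f"validation {metric}")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    fig.tight_layout()
    return fig
```

After training, `plot_learning_curves(history.history)` followed by `plt.show()` would display the two panels side by side.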

By observing the learning curves, we can see that the training and validation accuracy continue to improve as the number of epochs increases, while the training and validation loss continue to shrink. We can therefore conclude that the CNN model converges well on the training and validation data.

Evaluate the CNN model

After training the model on the training dataset, we need to evaluate it on the test dataset to check how well our proposed model fits the given problem, and plot the confusion matrix to better visualize the results.

Next, we print the test accuracy and loss for the CNN model along with the confusion matrix.
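The evaluation could be sketched as follows. The confusion matrix is computed here with plain NumPy, which may differ from the library the original code used. Both helper names are hypothetical, and `evaluate_model` assumes the model was compiled with a single `accuracy` metric so that `evaluate()` returns the loss followed by the accuracy.

```python
import numpy as np


def confusion_matrix(y_true: np.ndarray, y_pred: np.ndarray,
                     num_classes: int = 10) -> np.ndarray:
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # add.at accumulates correctly even when the same (row, col) pair repeats.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix


def evaluate_model(model, X_test, y_test_one_hot):
    """Print test loss and accuracy, then return the confusion matrix."""
    loss, accuracy = model.evaluate(X_test, y_test_one_hot, verbose=0)
    print(f"Test loss: {loss:.4f}, test accuracy: {accuracy:.4f}")
    # Convert softmax probabilities and one-hot labels back to class indices.
    y_pred = np.argmax(model.predict(X_test), axis=1)
    y_true = np.argmax(y_test_one_hot, axis=1)
    return confusion_matrix(y_true, y_pred)
```

The diagonal of the returned matrix counts the correctly classified images for each digit, while off-diagonal entries show which digits are confused with each other.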

Plot the result

Now, let’s plot our results for the first nine images in the test dataset along with their true and predicted classes.
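A possible helper for this plot is sketched here. The name `plot_predictions` is hypothetical, and the labels are assumed to be class indices rather than one-hot vectors.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_predictions(images: np.ndarray, true_labels, predicted_labels):
    """Show the first nine images titled with their true and predicted class."""
    fig, axes = plt.subplots(3, 3, figsize=(7, 7))
    for i, ax in enumerate(axes.flat):
        # squeeze() drops a trailing channel axis so imshow receives a 2D image.
        ax.imshow(np.squeeze(images[i]), cmap="gray")
        ax.set_title(f"True: {true_labels[i]}, Predicted: {predicted_labels[i]}")
        ax.axis("off")
    fig.tight_layout()
    return fig
```

The predicted classes can be obtained with `np.argmax(model.predict(X_test[:9]), axis=1)`, and the true classes with `np.argmax(y_test[:9], axis=1)` if the labels are one-hot encoded.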

 

Conclusion

Congratulations! 👏 You have successfully learned and implemented a Convolutional Neural Network using Keras for the MNIST handwritten digit classification problem all on your own. 😎

Thanks for reading this article. Let me know if you have any queries or suggestions in the comment section below.

 
