How to Code A Neural Network From Scratch Part 2 - Processing the MNIST Data

Loading the Training Data

-------------------------

We begin by loading the training images and labels from file using numpy arrays. We import the numpy library and initialize two arrays, one for the image data and one for the corresponding labels. The image data is read into a binary mode array with four bytes per pixel, which represents 32-bit integers. Similarly, we load the test images by reading a binary mode array with four bytes per pixel and reshape them into 784-element row vectors. This will be used to flatten the 28x28 matrix into a single row vector.

The system is set up again for loading the training data, but this time we read in two sets of frames from files, and use np.float64 instead of int8 for the test image data, and np.uint8 for the frame image data. The number of images in the training dataset is 60,000 and the number of test images is 10,000.

Visualizing the Data

---------------------

To visualize the data, we create a function called visualized_data() that loads an image array and a label array using numpy arrays. We use matplotlib to display the images and labels in separate subplots. The first subplot shows the training set with an 8x8 grid of images, while the second subplot shows the test set with an 8x8 grid of images.

We take a look at some specific images, including the nines and the Ottoman number. We notice that the variation in how people write the number seven is quite interesting, with some people using a slash through it and others not. This will be important when designing our neural network to recognize handwritten digits.

Code Implementation

--------------------

The code implementation of the system consists of two main parts: loading the data and programming the actual basis for the normal Network.

To load the data, we create functions called load_data() that loads the training images and labels into a usable form. This involves reading in the binary mode arrays, flattening them into 784-element row vectors, and reshaping them back into 28x28 matrices.

The programming of the actual basis for the normal Network involves creating a class called NormalNet that will contain the activation functions and coding necessary for the neural network to work correctly. We'll cover this in more detail in the next video.

In conclusion, we've loaded the training data and visualized it using matplotlib. The data consists of 60,000 training images and 10,000 test images, each with a corresponding label. The variation in handwritten digits is quite interesting, particularly when it comes to recognizing the number seven. In the next video, we'll dive into programming the actual basis for the normal Network.