How to Code A Neural Network From Scratch Part 2 - Processing the MNIST Data

Loading the Training Data

-------------------------

We begin by loading the training images and labels from file into numpy arrays. The MNIST files are binary, so they are opened in binary mode ('rb') and their headers are decoded with the struct module. Each label file starts with two big-endian 32-bit integers (the magic number and the number of items), followed by one unsigned byte per label. Each image file starts with four 32-bit integers (the magic number, the number of images, the number of rows, and the number of columns), followed by one unsigned byte per pixel. After moving past the header, we load the remaining bytes with numpy and reshape them so that each 28x28 image becomes a 784-element row vector.
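As the transcript explains, a label file begins with two big-endian unsigned 32-bit integers that have to be read before the label bytes can be reached. The following is a minimal sketch of that step using only the standard library. The helper name `read_label_header` and the in-memory demo bytes are assumptions made for illustration; the format string `">II"` (big-endian, two unsigned integers) follows what the speaker describes.

```python
import io
import struct


def read_label_header(f):
    """Read the 8-byte header of a label file and return (magic, count)."""
    # ">" means big-endian; each "I" is one unsigned 32-bit integer.
    magic, count = struct.unpack(">II", f.read(8))
    return magic, count


# Hypothetical stand-in for a real label file: header plus three label bytes.
demo_file = io.BytesIO(struct.pack(">II", 2049, 3) + bytes([5, 0, 4]))

magic, count = read_label_header(demo_file)
# After the header, every remaining byte is one label (0 through 9).
labels = list(demo_file.read(count))
print(magic, count, labels)
```

The header values are thrown away in the video, but reading them is what advances the file position to the first label.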

The same pattern repeats for the image files, except that the header holds four integers instead of two, so we read 16 bytes instead of 8. Both the labels and the pixel data are loaded as np.uint8. The test files (named t10k instead of train) are laid out in exactly the same way, so the same code is reused for them. The training set contains 60,000 images and the test set contains 10,000.
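The image-loading step can be sketched as follows. This is a hedged illustration rather than the code from the video: the function name `load_idx_images` is an assumption, and the demo writes a tiny synthetic file (two fake images) so that the sketch runs without the MNIST download. It relies on `np.fromfile` continuing from the current position of the open file, that is, just past the 16-byte header.

```python
import os
import struct
import tempfile

import numpy as np


def load_idx_images(path):
    """Load an image file as a (num_images, rows * cols) uint8 matrix."""
    with open(path, "rb") as images:
        # Header: magic number, image count, rows, cols (big-endian uint32).
        magic, num, rows, cols = struct.unpack(">IIII", images.read(16))
        # Everything after the header is one unsigned byte per pixel.
        pixels = np.fromfile(images, dtype=np.uint8)
    # One row per image, 28 * 28 = 784 columns for MNIST.
    return pixels.reshape(num, rows * cols)


# Synthetic demo data: 2 images of 28x28 pixels with made-up values.
fake_pixels = (np.arange(2 * 28 * 28) % 256).astype(np.uint8)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo-images-idx3-ubyte")
    with open(demo_path, "wb") as f:
        f.write(struct.pack(">IIII", 2051, 2, 28, 28))
        f.write(fake_pixels.tobytes())
    demo_images = load_idx_images(demo_path)

print(demo_images.shape, demo_images.dtype)  # (2, 784) uint8
```

With the real data, the same function would be pointed at the uncompressed training or test image file in the working directory; the exact file name depends on how the download was unpacked.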

Visualizing the Data

---------------------

To visualize the data, we create a function called visualized_data() that takes an image array and a label array. It uses matplotlib to create an 8x8 grid of subplots that share their x and y axes, picks a digit, and displays the first 64 images of that digit. Each image is reshaped from its 784-element row vector back to 28x28 and drawn with a gray color map. The plot is shown once, outside the for loop that fills the grid.
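A plotting function along these lines might look like the sketch below. It is not the video's exact code: the name `show_digit_grid`, its parameters, and the random demo data are assumptions. The calls themselves (`plt.subplots` with `sharex`/`sharey`, `imshow` with `cmap="gray"`) are standard matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np


def show_digit_grid(images, labels, digit, n_rows=8, n_cols=8):
    """Draw up to n_rows * n_cols images of one digit in a grid."""
    # Keep only the rows whose label matches, then take the first 64.
    selected = images[labels == digit][: n_rows * n_cols]
    fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True,
                             squeeze=False)
    for ax in axes.flat:
        ax.axis("off")  # hide ticks and frames on every cell
    for ax, row in zip(axes.flat, selected):
        # Undo the flattening: 784-element row vector back to 28x28.
        ax.imshow(row.reshape(28, 28), cmap="gray", interpolation="nearest")
    return fig


# Made-up demo data: 100 random "images", labels cycling through 0..9.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(100, 784), dtype=np.uint8)
demo_labels = np.arange(100) % 10

demo_fig = show_digit_grid(demo_images, demo_labels, digit=9)
# Call plt.show() once here, after the loop has filled the grid,
# never inside the loop.
```

Returning the figure instead of showing it inside the function keeps the single `plt.show()` call under the caller's control.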

We take a look at some specific digits, including the nines and the sevens. The nines vary a fair amount: some look like a g, an eight, or a seven, and ambiguous images like these are where the network's errors will come from. The sevens vary too, with some people putting a slash through the digit and others not. It will be interesting to see how the neural network handles this.

Code Implementation

--------------------

The code in this part has one job: loading the data and inspecting it. Programming the actual basis for the neural network is left for the next video.

To load the data, we create a function called load_data() that reads the training and test images and labels into a usable form. This involves opening the files in binary mode, moving past their headers, and flattening each 28x28 image into a 784-element row vector. The visualization function later reshapes those vectors back into 28x28 matrices for display.
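The flattening and the later reshape are exact inverses of each other, which the short sketch below checks on a made-up 28x28 array. The variable names are assumptions for illustration; the index arithmetic (pixel at row r, column c lands at position r * 28 + c) is how numpy's default row-major order works.

```python
import numpy as np

# Made-up 28x28 "image" with distinct-ish values so positions can be checked.
image = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)

# Flatten to a 784-element row vector, as done when loading.
row = image.reshape(784)

# Reshape back to 28x28, as done before displaying.
restored = row.reshape(28, 28)

r, c = 3, 5
print(row.shape, np.array_equal(restored, image), row[r * 28 + c] == image[r, c])
```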

Programming the actual basis for the neural network, including the activation functions and encoding the labels into a one-hot representation, is covered in the next video.
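The transcript mentions that the next video encodes the labels into a one-hot representation. As a preview only, here is one common way to do that with numpy. The function name `one_hot` and its `num_classes` parameter are assumptions, and the video may implement it differently.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Turn integer labels into rows with a single 1.0 at the label index."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float64)
    # For row i, set column labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


demo_encoded = one_hot([5, 0, 9])
print(demo_encoded.shape)  # (3, 10)
```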

In conclusion, we've loaded the MNIST data and visualized it using matplotlib. The data consists of 60,000 training images and 10,000 test images, each with a corresponding label. The variation in handwritten digits is quite interesting, particularly in how people write the number seven. In the next video, we'll dive into programming the actual basis for the neural network.

Everybody, welcome back to part two of a series on building a neural network from scratch. If you haven't watched part one already, go ahead and take a look. It sets up the motivation for what we're trying to do and why, and it'll also give you some important information on how neural networks work. So let's jump right in.

We're going to be building a neural network to process the MNIST data set of handwritten digits. If you're not familiar with it, this is a training set of 60,000 examples with labels of the digits 0 through 9. They are represented as 28 by 28 matrices of greyscale values, and it's kind of a benchmark data set for machine learning algorithms. You can find it by doing a quick Google search for Yann LeCun's website, or just search for the MNIST database of handwritten digits, and it should pop right up. You download four different files: the training images and labels, as well as the test images and labels. It's not too large, just a few hundred megabytes, and you want to download it directly into your working directory, which makes addressing the files a little bit easier.

Now there is a catch to this. When we read files, typically we deal with text files, right? We're reading in text, even text representations of numbers. In this case we're dealing with actual bytes, so we're going to have to do a little bit of work to get this into a form that we can actually work with. This video is going to deal strictly with acquiring the data and processing it, and then we'll do some visualization so you can see what it looks like. If you take a look at their page, they give us some benchmarks: test error rates for various algorithms. We won't get to actually running the network in this video, but down the line we'll be able to compare our numbers to these and see how we do.

The important thing to know is that this data is stored as byte files, so we are going to need to do some transformations. Luckily they give us some information on how the file is laid out. You can see that the training set label file has a couple of integers at the start, 32-bit integers, or four bytes each; that's represented by this offset of four here. They tell you the magic number and the number of items. We're going to treat these two as throwaways because we're not going to use them for anything, but we do have to peek into the file and move past this data. The actual data is then represented as unsigned bytes, so just eight bits. Likewise, for the training set image file there are four different 32-bit integers at the beginning of the file that we're going to have to read in, and we will use the number of images, because we're going to have to take that 28 by 28 array and flatten it into a row vector. So this should be pretty straightforward.

Let's go ahead and get started. We're going to need something called struct to handle the transformation from the byte files into Python bytes, and I'll show you the documentation on that in a moment. We're also going to be using numpy, because all these operations are going to be performed on numpy arrays, and we're going to want to visualize the data, so we're going to want to import matplotlib.pyplot. If you don't have any of these, you can just do pip install, say, matplotlib. I already have it, so I'm not installing anything, but if you didn't have it, just doing pip install would install that package for you. Same thing with numpy. You should already have struct and os, though.

First we want to define a function to load in our data. We're not going to be using an object-oriented approach to this, because we're not going to be doing any real expansion of this later; we're just going to do this all in straight procedural-style code. So, as usual, with open, and we'll start with our label file, the training labels. If I could type today, that would be helpful. Since we are reading in bytes, we have to use 'rb', because we're reading binary data, and we're going to call this labels. As I said, we have to read the first two 32-bit integers in the file, the first of which we'll call magic, and we're going to use struct to do that. I'll show you the syntax for this in a second. We want to read eight bytes.

So let's take a look at the documentation for struct. It's a module that performs conversions between Python values and C structs represented as Python bytes objects. It's used for handling binary data stored in files or from network connections, among other sources. That's exactly what we're dealing with: binary data in a file. We want to actually unpack this data, and we have to use this funny syntax right here. What this is doing is telling it that the data is big-endian, and the reason we need to do that is because of the way the data is stored: it's in the format used by non-Intel processors. I'm using an Intel processor, so I have to flip the bytes of the header, which are big-endian, and that's why I need this greater-than sign here. The two capital I's indicate that I'm reading two unsigned integers. We also want to unpack it, so let's do a search for unpack really quick. Where is the syntax? Here it is: struct.unpack, with format and buffer. Our buffer is just labels.read of the first eight bytes of the data file.

Now that we have advanced into the data file, we are actually able to start accessing the data we really want, which is the unsigned bytes that represent our labels. So we just dump this into something called train_labels, and we want to store it in a numpy array: np.fromfile of labels, and these are unsigned 8-bit integers. And that's all there is to it.

Now let's go ahead and open up the training images, for which we'll have to do some similar trickery. Again we want to read in binary mode, and we'll just call it images. In this case we have four values to read, four 32-bit integers, and we're going to actually use the number of images, because we have to take the 28 by 28 matrix and flatten it into a 784-element row vector (784 is 28 times 28). Our training set structure will be just a matrix of these row vectors. So this is num, rows and cols. We have the same syntax again, but we're reading four instead of two, so we're going to want to read 16 bytes instead of 8. Then we want train_images equal to np.fromfile, with dtype equal to np.uint8, but we want to reshape this, as I said, flattening it into the number of rows, which is the number of images, and 784 columns.

This is all the heavy lifting. We just repeat the same thing for the test data, so I'm just going to copy and paste. These files are t10k instead of train, and we want test instead of train in the names, but the data is laid out exactly the same, so we don't have to go over all of that again. Then we want to return the training images, training labels, test images and test labels. That is all that is really required to load the data into a usable form.

That by itself isn't particularly useful, and I want to show you how this data actually looks, so we're going to use another function. We're going to call it visualized_data, and it takes an image array and a label array. We're going to use matplotlib to do this, so we'll say fig, ax = plt.subplots, and let's take an 8 by 8 grid of subplots. These are going to share the x and y axes. Let's take a look at the first 64 images, which is of course the number of rows times the number of columns, and we're going to pick a number. Let's say we want to take a look at the nines first. Recall that we flattened each image into a 784-element array, so we want to reshape it back to the 28 by 28 image representation, and then we want to show it. You can use different color maps; I'll just use gray, with some interpolation. And, very important, you want to show the plot outside of the for loop. You definitely don't want to put the show call into the for loop; you'll get a mess.

That is all we really need to get rolling. Let's say we want to take a look at this data: train X, train Y, test X, test Y equal load_data, where X is our images and Y is our labels. So now we have our data. Now let's go ahead and visualize that data and look at the training set. All right, let's see what we get. Oh, we get an error. It says read of closed file. What have I done? Line 11, with the train-images file open as images... oh, that's why. It should be images.read. I was trying to read a closed file, which is labels, and of course that won't work. I probably did something similar further down because I just copied and pasted it. Of course, I used labels instead of images. Okay, and it is thinking.

And here is our set of nines. You can see there is a fair amount of variation. Some of them look like g's, and this one kind of looks like an eight. That's where the error in our neural network is going to come from: images that are a little bit ambiguous and can kind of look like other things. This one almost looks like a seven, for instance. So I expect our neural network is going to have a little bit of work cut out for itself. Let's take a look at some of the other numbers, let's say seven, and see how that looks. The sevens have a similar story. What's interesting is the variation in the way people write the number seven: some people put a slash through it, others don't. This one kind of looks like, I don't know, just a squiggle. So it will be interesting to see how the neural network handles this.

We're running a little bit long, so I'm going to go ahead and chop it here. In our next set of videos we're going to get into programming the actual basis for the neural network. We'll talk about the activation functions and coding them, as well as encoding the labels into a one-hot representation. I look forward to seeing you in that next video. If you liked this, go ahead and leave a thumbs up and subscribe, and I'll see you guys in the next part of our series.
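The "read of closed file" error the speaker runs into comes from reading through the labels handle after its with-block has already ended. The sketch below reproduces that mistake on two throwaway files and then does the read correctly. The file names and contents are made up for the demonstration and are not the MNIST files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    label_path = os.path.join(tmp, "demo-labels.bin")
    image_path = os.path.join(tmp, "demo-images.bin")
    for path in (label_path, image_path):
        with open(path, "wb") as f:
            f.write(bytes(16))  # 16 zero bytes as placeholder content

    with open(label_path, "rb") as labels:
        labels.read(8)
    # `labels` is closed as soon as its with-block ends.

    with open(image_path, "rb") as images:
        try:
            labels.read(16)  # the copy-paste bug: wrong (closed) handle
        except ValueError as err:
            error_message = str(err)
        header = images.read(16)  # the fix: read from the open handle

print(error_message)
print(len(header))
```

Because each file is opened in its own with-block, the handle name used inside the block has to match the name bound by that block's `as` clause.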
