TensorFlow 2.0 Tutorial - Training the Model - Text Classification P3

Neural Networks and Text Classification: An Exploration of Accuracy and Performance

In this exploration, we delve into neural networks and text classification, focusing on accuracy and performance. Our model, trained on 25,000 movie reviews from the IMDB dataset (another 25,000 are reserved for testing), reached an accuracy of 87% on the test data. Notably, test accuracy was slightly lower than the accuracy of the final training epoch: performance on new data is often lower than on data the model has already seen, which is exactly why a robust validation process is needed to confirm the model generalizes as expected.
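The validation setup described in the video carves the first 10,000 of the 25,000 training reviews off as a validation set. A minimal sketch of that split, using placeholder lists in place of the real IMDB arrays:

```python
# Placeholder data standing in for the real IMDB training arrays.
train_data = [[1, 14, 22]] * 25000   # 25,000 integer-encoded reviews
train_labels = [0, 1] * 12500        # 25,000 labels (0 = negative, 1 = positive)

# Hold out the first 10,000 examples for validation;
# the remaining 15,000 are used for training.
x_val, x_train = train_data[:10000], train_data[10000:]
y_val, y_train = train_labels[:10000], train_labels[10000:]

print(len(x_train), len(x_val))  # 15000 10000
```

The exact split size is a tunable hyperparameter; 10,000 simply follows the TensorFlow tutorial the video is based on.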

To gain further insight into the model's behavior, we ran a prediction on a single review from the test data, and the model predicted the correct label. On closer inspection, however, the decoded review contained leftover "br" tokens from HTML `<br />` line-break markup, along with <UNK> placeholders for words outside the vocabulary. This underscores the challenges of preparing and handling real-world text data.

One benefit of the trained model is its ability to make predictions on data it has never seen. For instance, for a test review beginning "please give this one a miss", the model predicted the negative label (0), matching the actual label. The review also contained unknown tokens, possibly produced by emojis or other out-of-vocabulary characters, which could in principle affect prediction quality.

The performance of our model is also shaped by its architecture and training process. It trained quite quickly: the 50,000-review IMDB dataset sounds large, but it is small as string data goes and can be processed efficiently. In future videos, we aim to train models on much larger datasets, where training may take several hours or even days.
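The training setup from the video can be sketched end to end. The compile settings (Adam optimizer, binary cross-entropy loss for the 0/1 sigmoid output) are the ones used in the tutorial; the model and data below are tiny stand-ins so the sketch runs on its own, whereas the tutorial uses the IMDB arrays with epochs=40 and batch_size=512:

```python
import numpy as np
from tensorflow import keras

# Tiny stand-in model and data; the tutorial's real model is an
# embedding + pooling + dense network over the IMDB vocabulary.
vocab_size, maxlen = 100, 20
model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 16),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid'),
])

# Same compile settings as the tutorial: Adam optimizer and
# binary cross-entropy, since the label is one of two values (0 or 1).
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

rng = np.random.default_rng(0)
x = rng.integers(0, vocab_size, size=(200, maxlen))
y = rng.integers(0, 2, size=(200,))
x_train, y_train = x[:150], y[:150]
x_val, y_val = x[150:], y[150:]

# The video uses epochs=40 and batch_size=512; scaled down here.
model.fit(x_train, y_train, epochs=2, batch_size=32,
          validation_data=(x_val, y_val), verbose=0)

# evaluate() returns [loss, accuracy] given the metrics above.
loss, acc = model.evaluate(x_val, y_val, verbose=0)
print(loss, acc)
```

On the real IMDB data this evaluation is what produced the roughly 0.33 loss and 87% accuracy quoted above.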

In conclusion, our exploration of neural networks and text classification has provided valuable insights into the importance of accuracy and performance. By examining our model's performance on test data and conducting an experiment using a review from the same dataset, we have gained a deeper understanding of its capabilities and limitations. In the next video, we will build upon this knowledge by exploring how to save and reuse our trained models for faster predictions.

Saving and Reusing Trained Models: A Game-Changer in Machine Learning

In machine learning, training a model from scratch can be time-consuming, especially for complex tasks such as text classification. Fortunately, we can avoid repeating that cost: once a model is trained, it can be saved to disk and reloaded later, letting us make predictions almost instantly without retraining.

To demonstrate this, we will save our trained model to a file and load it back whenever it is needed. The same saved model can then serve any number of predictions, eliminating the retraining step before each run.

Here is an example of how we can achieve this:

```python
# Save the trained model to a file.
# Note: pickle is unreliable for Keras models; the built-in
# model.save() method is the recommended way to serialize them.
model.save('saved_model.h5')
```

To load and reuse the saved model for predictions, we can use the following script:

```python
from tensorflow import keras

# Load the saved model from a file
loaded_model = keras.models.load_model('saved_model.h5')

# The model expects integer-encoded, padded sequences, not raw text,
# so a new review must be encoded with the same word index used in
# training (index 1 = <START>; unknown words map to <UNK>, index 2).
word_index = keras.datasets.imdb.get_word_index()
word_index = {word: index + 3 for word, index in word_index.items()}

review = "please give this one a miss"
encoded = [1] + [word_index.get(word, 2) for word in review.split()]
padded = keras.preprocessing.sequence.pad_sequences(
    [encoded], value=0, padding='post', maxlen=250)

# Make a prediction using the loaded model
prediction = loaded_model.predict(padded)
print(prediction)  # close to 0 = negative review, close to 1 = positive
```

By saving and reusing our trained models, we can significantly improve the efficiency of our machine learning workflows. In future videos, we will delve deeper into this topic and explore more advanced techniques for optimizing model performance.

Interpreting Results: A Closer Look at Our Model's Performance

To better understand the model's behavior, we fed a single review from the test data into the model, and it predicted the correct label. Decoding the review, however, revealed leftover "br" tokens from HTML `<br />` line-break markup that our preprocessing did not strip out.

The review also contained <UNK> placeholders, possibly produced by emojis or other out-of-vocabulary characters. The review itself opens with "please give this one a miss" and goes on to complain that the cast "rendered terrible performance" and that the show was "flat"; it is clearly negative, and the model correctly predicted the negative label (0). Because the review was short, it was also padded out with <PAD> tokens to the fixed length of 250.

The model's ability to handle nuanced text data is crucial for its success in applications such as sentiment analysis and text classification. By examining our model's performance on test data, we can identify areas where it needs improvement and make adjustments accordingly.
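Inspecting what the model actually sees means decoding the integer-encoded review back into words. A minimal sketch, assuming the standard IMDB setup in which indices 0-3 are reserved for <PAD>, <START>, <UNK>, and <UNUSED> (the tiny word index here is illustrative; the real one comes from `keras.datasets.imdb.get_word_index()`, shifted by 3):

```python
# Illustrative word index with the IMDB reserved tokens in slots 0-3.
word_index = {'<PAD>': 0, '<START>': 1, '<UNK>': 2, '<UNUSED>': 3,
              'please': 4, 'give': 5, 'this': 6, 'one': 7, 'a': 8, 'miss': 9}
reverse_word_index = {index: word for word, index in word_index.items()}

def decode_review(encoded):
    """Map each integer back to its word; unseen indices become '?'."""
    return ' '.join(reverse_word_index.get(i, '?') for i in encoded)

# A padded, encoded review: <START>, six words, then padding.
encoded = [1, 4, 5, 6, 7, 8, 9, 0, 0, 0]
print(decode_review(encoded))
# <START> please give this one a miss <PAD> <PAD> <PAD>
```

This is exactly why the decoded test review showed runs of <PAD> at the end and <UNK> wherever a token fell outside the vocabulary.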

Conclusion

In this exploration of neural networks and text classification, we demonstrated the importance of accuracy and performance in machine learning models. Our model, trained on 25,000 IMDB reviews, achieved 87% accuracy on the held-out test data. As always, accuracy can vary on new data, and somewhat lower accuracy than on the training set should be expected.

We also explored how to save and reuse our trained models for faster predictions, which significantly improves the efficiency of our machine learning workflows. In future videos, we will delve deeper into this topic and explore more advanced techniques for optimizing model performance.

By examining our model's performance on test data and conducting experiments using reviews from the same dataset, we gained a deeper understanding of its capabilities and limitations. This exploration highlights the significance of having a robust validation process to ensure the model is performing as expected and demonstrates the importance of accuracy and performance in machine learning models.

"WEBVTTKind: captionsLanguage: enall right so now it's time to compile and train our model now the first thing we have to do is just define the model give it an optimizer give it a loss function and then I think we have to define the metrics as well so we're gonna do is gonna say model equals in this case or so not model equals model dot compile if I spell compile correctly and then here we're gonna say optimizer we're gonna use the atom optimizer again I'm not really going to talk about what these are that much if you're interested in the optimizer just look them up and then for the loss function where you're going to use the binary underscore cross and Trippi now what this one essentially is is well binary means like two options right and our case we want to have two options for the output neuron which is 0 or 1 so what's actually happening here is we have the sigmoid function which means our numbers gonna be between 0 & 1 but what the loss function will do is pretty well calculate the difference between for example say our output neuron is like 0.2 and the actual answer was 0 well it will give us a certain function that can calculate the loss so how much of a difference a zero point two is from zero and that's kind of how that works again I'm not gonna talk about them too much and they're not like I mean they are important but nots really like memorize per se like you kind of just mess with different ones but in this case binary cross-entropy works well because we have two possible values 0 1 so rather than using the other one that we used before which I don't even remember what it was called something cross-entropy we're using binary cross entropy okay so now what we're gonna do is we're actually gonna split our training data into two sets and the first set of our training data is gonna be called validation data or really I asked you can think of it as a second the word it doesn't really matter but what we're gonna do is just get some validation data and what 
validation data is is essentially we can check how well our model is performing based on the tunes and tweaks we're doing on the training data on new data now the reason we do that is so that we can get a more accurate sense of how well our model is because we're gonna be testing new data to get the accuracy each time rather than testing it on data that we've already seen before which again means that the can't simply just memorize each review and give us either a zero or one for that it has to actually have some degree of I don't know like thinking or operation so that it can work on new data so we're gonna do is gonna say X underscore Val equals and all we're gonna do is just grab the train data and we're just gonna cut it to a thousand or ten thousand entries so there's actually twenty five thousand entries or I guess reviews in our training data so we're just gonna take ten thousand of it and say we're going to use that as validation data now in terms of the size of validation data it doesn't really matter that much this is what tensorflow is using so I'm just kind of going with that but again mess with these numbers and see what happens to your model everything with our neural networks and machine learning really is gonna come down to very fine what's known as hyper parameters or like hyper tuning which means just changing individual parameters each time until we get a model that is well just better and more accurate so we're gonna say that X value equals that but then we're also gonna have to modify our X train data to be Train underscore data and in this case we're just gonna do the other way around so ten thousand : now I'll just copy this and we're just gonna replace this again with instead of test actually oh we have to do this with labels sorry what am I thinking so we're just gonna train change this to be labels and then instead of X value is just gonna be Y value and then wide train um so yeah we're not touching the test static because we're gonna use 
all that test data to test our model and then we're just gonna use the the training stuff for the validation data to validate the model alright so now that we've done that it is actually time to fit the model so I'm just gonna say uh like fit model and you'll see what I'd name this something different in a second who's gonna be equal to model dot fit and in this case what we're gonna do is going to say X underscore train Y underscore train we're gonna say epochs is equal to angle tights about 40 and again you can mess with this number and see what we get based on that and then say batch underscore size equals 512 which I'll talk about in a second and then finally we're gonna say validation underscore data equals and in here we're gonna say X underscore Val why underscore Valley and I think that's it let me just check here quickly a one last thing that I forgot to do we're gonna say verbose equals one verbose equals one now I'm not gonna lie I honestly don't know what verbose is I probably should look it up before the video but I have no idea what that is so someone knows please let me know but the badge size is essentially how many what do you call it um movie reviews we're gonna do each time or how many we're gonna load in at once because this thing is it's kind of I mean we're loading all of our reviews into memory but in some cases we won't be able to do that and we won't be able to like feed the model all of our reviews on each single cycle so we just set up a batch size that's gonna define essentially how many at once we're gonna give and I know I'm kind of horribly explaining what a batch sizes but we'll get into more on batch sizes and how we can kind of do like buffering through our data and like going taking some from a text file and reading into memory in later videos when we have like hundreds of gigabytes of data that we're gonna be working with okay so finally we're gonna say results equals and in this case I believe it is model dot evaluates and then 
we're gonna evaluate this obviously on our test data so we're gonna give it test data and test labels so test underscore data test underscore labels like that and then finally what I'm gonna do is just actually print out the results so we can see what our accuracy is so say print results and then get that value so let me run this quickly neural networks text classification let's go see MD and then python text or that's not even when we're using a reasoning tutorial - sorry and let's see what we get with this this will take a second to run through the epoch so I'll fast-forward through that so you guys don't have to wait alright so we just finished doing the epochs now and essentially our accuracy was 87% and this first number I believe is the loss which is 0.33 and then you can see that actually here we get the accuracy values and know to set the accuracy from our last epoch was actually greater than the accuracy on the test data which again shows you that sometimes you know when you test it on new data you're gonna be getting a less accurate model or in some cases you might even get a more accurate model it really just you can't strictly go based off what you're getting on your training data you really do need to have some test and validation data to make sure those models correctly working so that's essentially what we've done there and yeah I mean that that's the model we tested it's 87 percent accurate so now let's actually have let's interpret some of these results a little bit better and let's show some reviews let's do a prediction on some of the reviews and then see like if this our model kind of makes sense for what's going on here so what I'm going to do is I'm just going to actually just copy some output that I have here just save us a bit of time because I am gonna wrap up the video in a minute here but essentially what this does it just takes the first review from test data gets the model to predict that because we obviously we didn't train it on the 
test data so we can do that fine we're gonna say review and then we print out the decoded review we're gonna print out what the model predicted and then we're gonna print out what the actual label of that was so if I run this now I'll fast forward through the kind of training process and we will see the other all right so this is what essentially our review looks like so at least the one that we were testing it on and you can see that we have these little start tag and it says please give this one a Miss for and then B R stands for like brake line or go to the next line so we could have actually added another tag for B R if we notice that this was used a lot in the review but we didn't do that so you see B R unless this is actually part of the review but I feel like that should be like brake line in terms of HTML anyways and we have some unknown characters which could be anything that we just didn't know it was and it says and the rest of the cast rendered terrible performance as the show is flat flat flat brbr i don't know how Michael Madison could have allowed this one on his plate he almost seemed he what does it seem to know this wasn't going to work out and his performance was quite unknown so all yeah so anyways you can see that this probably had like some emojis and or something and that's why we have all these unknowns and then obviously we made this which was pretty short to be the full length of 250 so we see all these pads that did that for us and then we have a prediction and an actual value of zero so we did end up getting this one correct now I think it'd be interesting actually to write your own review and test it on this so in the next video what I'm gonna do is show you how we can save the model to avoid doing like all of this every time we want to run the code because realistically we don't wanna wait like a minute or two before we can predict a movie review every time we just wanted to happen instantly and we definitely can do that I just haven't 
showed that yet in the series because that's kind of in like later what you do after you have machine learning and obviously like this this model trained pretty quickly like we only had about what was it like fifty thousand test data set which I it seems like a large number but it's really not especially when you're talking about string data so in future videos we're gonna be training models that take like maybe a few days to train at least that's the goal or maybe a few hours or something like that so in that case you're probably not gonna want to train it every time before you predict some information so that'll be useful to know how to save that so that being said I'm gonna end the video here I hope you guys enjoyed this and in the next video I will be showing you guys how to save the model and how to make predictions on our own written reviewsall right so now it's time to compile and train our model now the first thing we have to do is just define the model give it an optimizer give it a loss function and then I think we have to define the metrics as well so we're gonna do is gonna say model equals in this case or so not model equals model dot compile if I spell compile correctly and then here we're gonna say optimizer we're gonna use the atom optimizer again I'm not really going to talk about what these are that much if you're interested in the optimizer just look them up and then for the loss function where you're going to use the binary underscore cross and Trippi now what this one essentially is is well binary means like two options right and our case we want to have two options for the output neuron which is 0 or 1 so what's actually happening here is we have the sigmoid function which means our numbers gonna be between 0 & 1 but what the loss function will do is pretty well calculate the difference between for example say our output neuron is like 0.2 and the actual answer was 0 well it will give us a certain function that can calculate the loss so how 
much of a difference a zero point two is from zero and that's kind of how that works again I'm not gonna talk about them too much and they're not like I mean they are important but nots really like memorize per se like you kind of just mess with different ones but in this case binary cross-entropy works well because we have two possible values 0 1 so rather than using the other one that we used before which I don't even remember what it was called something cross-entropy we're using binary cross entropy okay so now what we're gonna do is we're actually gonna split our training data into two sets and the first set of our training data is gonna be called validation data or really I asked you can think of it as a second the word it doesn't really matter but what we're gonna do is just get some validation data and what validation data is is essentially we can check how well our model is performing based on the tunes and tweaks we're doing on the training data on new data now the reason we do that is so that we can get a more accurate sense of how well our model is because we're gonna be testing new data to get the accuracy each time rather than testing it on data that we've already seen before which again means that the can't simply just memorize each review and give us either a zero or one for that it has to actually have some degree of I don't know like thinking or operation so that it can work on new data so we're gonna do is gonna say X underscore Val equals and all we're gonna do is just grab the train data and we're just gonna cut it to a thousand or ten thousand entries so there's actually twenty five thousand entries or I guess reviews in our training data so we're just gonna take ten thousand of it and say we're going to use that as validation data now in terms of the size of validation data it doesn't really matter that much this is what tensorflow is using so I'm just kind of going with that but again mess with these numbers and see what happens to your 
model everything with our neural networks and machine learning really is gonna come down to very fine what's known as hyper parameters or like hyper tuning which means just changing individual parameters each time until we get a model that is well just better and more accurate so we're gonna say that X value equals that but then we're also gonna have to modify our X train data to be Train underscore data and in this case we're just gonna do the other way around so ten thousand : now I'll just copy this and we're just gonna replace this again with instead of test actually oh we have to do this with labels sorry what am I thinking so we're just gonna train change this to be labels and then instead of X value is just gonna be Y value and then wide train um so yeah we're not touching the test static because we're gonna use all that test data to test our model and then we're just gonna use the the training stuff for the validation data to validate the model alright so now that we've done that it is actually time to fit the model so I'm just gonna say uh like fit model and you'll see what I'd name this something different in a second who's gonna be equal to model dot fit and in this case what we're gonna do is going to say X underscore train Y underscore train we're gonna say epochs is equal to angle tights about 40 and again you can mess with this number and see what we get based on that and then say batch underscore size equals 512 which I'll talk about in a second and then finally we're gonna say validation underscore data equals and in here we're gonna say X underscore Val why underscore Valley and I think that's it let me just check here quickly a one last thing that I forgot to do we're gonna say verbose equals one verbose equals one now I'm not gonna lie I honestly don't know what verbose is I probably should look it up before the video but I have no idea what that is so someone knows please let me know but the badge size is essentially how many what do you call 
it um movie reviews we're gonna do each time or how many we're gonna load in at once because this thing is it's kind of I mean we're loading all of our reviews into memory but in some cases we won't be able to do that and we won't be able to like feed the model all of our reviews on each single cycle so we just set up a batch size that's gonna define essentially how many at once we're gonna give and I know I'm kind of horribly explaining what a batch sizes but we'll get into more on batch sizes and how we can kind of do like buffering through our data and like going taking some from a text file and reading into memory in later videos when we have like hundreds of gigabytes of data that we're gonna be working with okay so finally we're gonna say results equals and in this case I believe it is model dot evaluates and then we're gonna evaluate this obviously on our test data so we're gonna give it test data and test labels so test underscore data test underscore labels like that and then finally what I'm gonna do is just actually print out the results so we can see what our accuracy is so say print results and then get that value so let me run this quickly neural networks text classification let's go see MD and then python text or that's not even when we're using a reasoning tutorial - sorry and let's see what we get with this this will take a second to run through the epoch so I'll fast-forward through that so you guys don't have to wait alright so we just finished doing the epochs now and essentially our accuracy was 87% and this first number I believe is the loss which is 0.33 and then you can see that actually here we get the accuracy values and know to set the accuracy from our last epoch was actually greater than the accuracy on the test data which again shows you that sometimes you know when you test it on new data you're gonna be getting a less accurate model or in some cases you might even get a more accurate model it really just you can't strictly go based 
off what you're getting on your training data you really do need to have some test and validation data to make sure those models correctly working so that's essentially what we've done there and yeah I mean that that's the model we tested it's 87 percent accurate so now let's actually have let's interpret some of these results a little bit better and let's show some reviews let's do a prediction on some of the reviews and then see like if this our model kind of makes sense for what's going on here so what I'm going to do is I'm just going to actually just copy some output that I have here just save us a bit of time because I am gonna wrap up the video in a minute here but essentially what this does it just takes the first review from test data gets the model to predict that because we obviously we didn't train it on the test data so we can do that fine we're gonna say review and then we print out the decoded review we're gonna print out what the model predicted and then we're gonna print out what the actual label of that was so if I run this now I'll fast forward through the kind of training process and we will see the other all right so this is what essentially our review looks like so at least the one that we were testing it on and you can see that we have these little start tag and it says please give this one a Miss for and then B R stands for like brake line or go to the next line so we could have actually added another tag for B R if we notice that this was used a lot in the review but we didn't do that so you see B R unless this is actually part of the review but I feel like that should be like brake line in terms of HTML anyways and we have some unknown characters which could be anything that we just didn't know it was and it says and the rest of the cast rendered terrible performance as the show is flat flat flat brbr i don't know how Michael Madison could have allowed this one on his plate he almost seemed he what does it seem to know this wasn't going to 
work out and his performance was quite unknown so all yeah so anyways you can see that this probably had like some emojis and or something and that's why we have all these unknowns and then obviously we made this which was pretty short to be the full length of 250 so we see all these pads that did that for us and then we have a prediction and an actual value of zero so we did end up getting this one correct now I think it'd be interesting actually to write your own review and test it on this so in the next video what I'm gonna do is show you how we can save the model to avoid doing like all of this every time we want to run the code because realistically we don't wanna wait like a minute or two before we can predict a movie review every time we just wanted to happen instantly and we definitely can do that I just haven't showed that yet in the series because that's kind of in like later what you do after you have machine learning and obviously like this this model trained pretty quickly like we only had about what was it like fifty thousand test data set which I it seems like a large number but it's really not especially when you're talking about string data so in future videos we're gonna be training models that take like maybe a few days to train at least that's the goal or maybe a few hours or something like that so in that case you're probably not gonna want to train it every time before you predict some information so that'll be useful to know how to save that so that being said I'm gonna end the video here I hope you guys enjoyed this and in the next video I will be showing you guys how to save the model and how to make predictions on our own written reviews\n"