Python Tutorial - Introducing Hyperparameters

Hyperparameters: The Knobs and Dials of Machine Learning

In the previous lesson, you learned about parameters; now it's time to dive into hyperparameters. Hyperparameters are a crucial aspect of machine learning that can significantly impact the performance of your models. They are often compared to the knobs and dials on an old radio: you tune them before the modeling process begins and hope a good result comes out. This is the crucial difference from parameters: parameters are learned by the algorithm during training, while hyperparameters are set by the user and do not change while the model trains.

Creating an Instance of the Estimator

To see the hyperparameters, we can create an instance of the estimator with default settings and print it out. This shows all the different knobs and dials we can set for our model. There are a lot of them, but what do they all mean? To answer that, we need to turn to the scikit-learn documentation.
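
Here is a minimal sketch of that first step, assuming scikit-learn is installed and using the random forest classifier from this lesson:

```python
from sklearn.ensemble import RandomForestClassifier

# Create the estimator with all default settings.
rf = RandomForestClassifier()

# Printing the estimator shows its hyperparameters;
# get_params() returns them all as a dictionary.
print(rf)
print(rf.get_params())
```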

For example, take the `n_estimators` hyperparameter. The documentation tells us its data type and default value, and provides a definition of what it means. We can set hyperparameters when we create the estimator object. The default number of trees seems a little low, so let's set it to 100, and while we're at it, let's also set the criterion to entropy. If we print out the model, we can see that the other default values remain the same, while the values we set explicitly override the defaults.
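
A sketch of overriding those two hyperparameters at construction time (the defaults you see printed will depend on your scikit-learn version):

```python
from sklearn.ensemble import RandomForestClassifier

# Override n_estimators and criterion; every other
# hyperparameter keeps its default value.
rf = RandomForestClassifier(n_estimators=100, criterion='entropy')

print(rf)  # the printed model reflects the overridden values
```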

Logistic Regression Model

Now, let's move on to logistic regression models. We follow the same steps as before: create a logistic regression estimator and print it out. There are fewer hyperparameters for this model than for the random forest, but some are more important than others, and we need to understand which ones matter.
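
The same pattern, this time with scikit-learn's logistic regression estimator:

```python
from sklearn.linear_model import LogisticRegression

# Same steps as before: create the estimator, then print it.
log_reg = LogisticRegression()
print(log_reg)
```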

Before we outline the important hyperparameters, note that some hyperparameters definitely won't help with model performance. For the random forest classifier, these relate to computational decisions or to what information is retained for analysis. The number of cores to use (`n_jobs`) will only speed up modeling time; it won't improve accuracy. The random seed (`random_state`) and whether to print information during modeling (`verbose`) won't help either.
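
For instance, here is how those computational settings look in scikit-learn (the values shown are illustrative only):

```python
from sklearn.ensemble import RandomForestClassifier

# None of these settings change the model's accuracy.
rf = RandomForestClassifier(
    n_jobs=-1,        # use all available cores: faster, not more accurate
    random_state=42,  # fix the random seed for reproducibility
    verbose=1,        # print progress information during fitting
)
```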

Important Hyperparameters for Random Forest Models

So, what are some generally accepted important hyperparameters to tune for a random forest model? The `n_estimators` hyperparameter is crucial, as it determines how many trees are in the forest. It's often set to 500 or even 1000 or more, noting that higher values come with computational costs. The `max_features` hyperparameter controls how many features are considered at each split, which is vital for ensuring tree diversity, while the hyperparameters that constrain individual trees help control overfitting.
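
A sketch with illustrative values (the right settings depend on your data; `'sqrt'` is just one common choice for `max_features`):

```python
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    n_estimators=500,     # more trees usually help, at a computational cost
    max_features='sqrt',  # features considered per split; drives tree diversity
)
```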

Another hyperparameter worth knowing is the `criterion`, although its impact is relatively small and it is not generally a primary one to tune. There are hundreds of machine learning algorithms out there, and learning which hyperparameters matter is knowledge you'll build over time from a variety of sources. For example, there are great academic papers that explore many combinations of hyperparameters for a specific algorithm across multiple datasets. You can also find excellent blogs and tutorials online, and of course the scikit-learn documentation itself.

The Best Way to Learn

One of the best ways to learn about hyperparameters is through practical experience. It's essential to research this topic yourself and build your knowledge base for efficient modeling. By understanding how to adjust your hyperparameters, you'll be able to create more accurate and robust models that achieve better results.

"WEBVTTKind: captionsLanguage: enin the previous lesson you learned what parameters are you will now learn what exactly hyper parameters are how to find and set them as well as some tips and tricks for prioritizing your efforts let's get started hyper parameters are something that you set before the modeling process begins you could think of them like the knobs and dials on an old radio you tuned the different dials and buttons and hope that a nice tune comes out the algorithm does not learn the value of these during the modeling process this is the crucial differentiator between hyper parameters and parameters whether you set it or the algorithm learns it and informs you we can easily see the hyper parameters by creating an instance of the estimator and printing it out here we create the estimator with default settings and call the print function on our estimator those are all our different knobs and dials we can set for our model there are a lot but what do they all mean for this we need to turn to the scikit-learn documentation let us take the example of the n estimators hyper parameter we can see in the documentation that it tells us the data type and the default value and it also provides a definition of what it means we can set the hyper parameters when we create the estimator object the default number of trees seems to be a little low so let us set that to be 100 whilst we are at it let us also set the criterion to be entropy if we print out the model we can see the other default values remain the same but those we said explicitly overrode the default values what about our logistic regression model what are the hyper parameters for that we follow the same steps firstly we create a logistic regression estimator then we print it out we can see there are less hyper parameters for this model than for the random forest some are more important than others but before we outline important ones there are some hyper parameters that definitely will not help model performance these are related to computational decisions or what information to retain for analysis with the random forest classifier these hyper parameters will not assist model performance how many calls to use will only speed up modeling time a random seed and whether to print out information as the modeling occurs also won't assist hence some hyper parameters you don't really need to train during your work there are some generally accepted important hyper parameters to tune for a random forest model the n estimators how many trees are in the forest should be set to a high value 500 or a thousand or even more is not uncommon noting that there are computational cost to higher values the max features controls how many features to consider when splitting which is vital to ensure tree diversity the next to control overfitting of individual trees The Criterion hyper parameter may have a small impact but is not generally a primary hyperparameters there are hundreds of machine learning algorithms out there and learning which hyper parameters matter is knowledge you will build over time from a variety of sources for example there are some great academic papers where people have tried many combinations of hyper parameters for a specific algorithm on many data sets these can be a very informative read you can also find great blogs and tutorials online and can solve the psychic learn documentation of course one of the best ways to learn is just more practical experience it is important you research this yourself to build your knowledge base 
for efficient modeling let's explore some hyper parameterin the previous lesson you learned what parameters are you will now learn what exactly hyper parameters are how to find and set them as well as some tips and tricks for prioritizing your efforts let's get started hyper parameters are something that you set before the modeling process begins you could think of them like the knobs and dials on an old radio you tuned the different dials and buttons and hope that a nice tune comes out the algorithm does not learn the value of these during the modeling process this is the crucial differentiator between hyper parameters and parameters whether you set it or the algorithm learns it and informs you we can easily see the hyper parameters by creating an instance of the estimator and printing it out here we create the estimator with default settings and call the print function on our estimator those are all our different knobs and dials we can set for our model there are a lot but what do they all mean for this we need to turn to the scikit-learn documentation let us take the example of the n estimators hyper parameter we can see in the documentation that it tells us the data type and the default value and it also provides a definition of what it means we can set the hyper parameters when we create the estimator object the default number of trees seems to be a little low so let us set that to be 100 whilst we are at it let us also set the criterion to be entropy if we print out the model we can see the other default values remain the same but those we said explicitly overrode the default values what about our logistic regression model what are the hyper parameters for that we follow the same steps firstly we create a logistic regression estimator then we print it out we can see there are less hyper parameters for this model than for the random forest some are more important than others but before we outline important ones there are some hyper parameters that definitely will not help model performance these are related to computational decisions or what information to retain for analysis with the random forest classifier these hyper parameters will not assist model performance how many calls to use will only speed up modeling time a random seed and whether to print out information as the modeling occurs also won't assist hence some hyper parameters you don't really need to train during your work there are some generally accepted important hyper parameters to tune for a random forest model the n estimators how many trees are in the forest should be set to a high value 500 or a thousand or even more is not uncommon noting that there are computational cost to higher values the max features controls how many features to consider when splitting which is vital to ensure tree diversity the next to control overfitting of individual trees The Criterion hyper parameter may have a small impact but is not generally a primary hyperparameters there are hundreds of machine learning algorithms out there and learning which hyper parameters matter is knowledge you will build over time from a variety of sources for example there are some great academic papers where people have tried many combinations of hyper parameters for a specific algorithm on many data sets these can be a very informative read you can also find great blogs and tutorials online and can solve the psychic learn documentation of course one of the best ways to learn is just more practical experience it is important you research this yourself to 
build your knowledge base for efficient modeling let's explore some hyper parameter\n"